From owner-freebsd-hackers@freebsd.org  Wed Apr 12 02:32:29 2017
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704112311.v3BNB4fc094085@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <99e3673e-d490-faef-359d-c6ec8a36ee0c@gmail.com>
Date: Wed, 12 Apr 2017 10:32:18 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704112311.v3BNB4fc094085@elf.torek.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 02:32:29 -0000

On 2017-04-12 07:11, Chris Torek wrote:
>> The difference between the "ithread" and "interrupt filter" things
>> is that ithread has its own thread context, while interrupt handling
>> through interrupt filter shares the same kernel stack.
>
> Right -- though rather than "the same" I would just say "shares
> a stack", i.e., we're not concerned with *whose* stack and/or
> thread we're borrowing, just that we have one borrowed.
>
>> So, for an ithread, we should use MTX_DEF, which doesn't disable
>> interrupts, and for an "interrupt filter", we should use MTX_SPIN, which
>> disables interrupts.
>
> Right.
>
>> What really confuses me is that I don't see how owning an
>> "independent" thread context (i.e., an ithread) makes a thread run in
>> the "top-half" and how sharing the same kernel stack makes a thread run
>> in the "bottom-half".
>
> It's not that it *makes* it run that way, it's that it *allows* it
> to run that way -- and then the scheduler *does* run it that way.
>
>> I did read your long explanation in the previous mail. For the non-SMP
>> case, the "top-half/bottom-half" model goes well and I understand how
>> the *code* path/*data* path things go. But I still cannot fully
>> understand the model for the SMP case.
>
> It's fundamentally fairly tricky, but we start with that same first
> notion:
>
>  * If you have your own state (i.e., stack), you can be suspended
>    (stopped in the scheduler, giving the CPU to other threads):
>    *your* (private) state is preserved on *your* (private) stack.
>
>  * If you have borrowed someone else's state, anything that suspends
>    you, suspends them too.  Since this may deadlock, you are not
>    allowed to do it at all.

That is clear. How can I distinguish these two conditions? I mean, how
do I tell whether I am using my own state/stack or borrowing someone
else's?

> Once we block interrupts locally (as for MTX_SPIN, or
> automatically inside a filter style or "bottom half" interrupt),
> we are in a special state: we may not take *any* MTX_DEF locks at
> all (the kernel should panic if we do).
>
> This in turn means that data structures are protected *either* by
> a spin mutex *or* by a default (non-spin) mutex, never both.  So
> if you need to touch a spin-mutex data structure from thread-y
> ("top half") code, you obtain the spin mutex, and now no interrupts
> will occur *on this CPU*, and as a key side effect, you won't move
> *off* this CPU either.  If an interrupt occurs on another CPU and
> it goes to take the spin lock that protects that data, it loops
> at that point, not switching tasks, waiting for the MTX_SPIN mutex
> to be released:
>
>        CPU 1                          CPU 2
>     ----------------------------|-----------------------------
>     func() {                    | ... code not involving mtx
>         mtx_lock_spin(&mtx);    | ...
>         do some work            |    mtx_lock_spin(&mtx); /* loops */
>              .                  |        [stuck]
>              .                  |        [stuck]
>              .                  |        [stuck]
>         mtx_unlock_spin(&mtx);  |        [unstuck]
>              ...                |        do some work
>
> If an interrupt occurs on CPU 2, and that interrupt-handling code
> wants to touch the data protected by the spin lock, that code
> obtains the spin lock as usual.  Meanwhile the interrupt *cannot*
> occur on CPU 1, as holding the spin lock has blocked interrupts.
> So the code path on CPU 2 blocks -- looping in mtx_lock_spin(),
> not giving CPU 2 over to the scheduler -- for as long as CPU 1
> holds the spin lock.  The corresponding code path is already
> blocked on CPU 1, the same way it was back in the non-SMP, single-
> CPU days.
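
The diagram above can be walked through in userland as a rough sketch.
This is not kernel code: a C11 atomic_flag stands in for the MTX_SPIN
mutex, the names spin_trylock, spin_lock, spin_unlock and walk_diagram
are made up for illustration, and nothing here blocks interrupts the way
mtx_lock_spin() does. It only shows the part where CPU 2 keeps retrying
instead of handing the CPU over to the scheduler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userland stand-in for the spin mutex "mtx" in the diagram. */
atomic_flag mtx = ATOMIC_FLAG_INIT;

/* One attempt: true if we acquired the lock, false if it is already held. */
bool spin_trylock(atomic_flag *m)
{
    return !atomic_flag_test_and_set_explicit(m, memory_order_acquire);
}

/* Retry until acquired; never sleeps. Returns the number of failed tries. */
unsigned long spin_lock(atomic_flag *m)
{
    unsigned long spins = 0;

    while (!spin_trylock(m))
        spins++;                /* the "[stuck]" rows of the diagram */
    return spins;
}

void spin_unlock(atomic_flag *m)
{
    atomic_flag_clear_explicit(m, memory_order_release);
}

/*
 * Single-threaded replay of the diagram: "CPU 1" takes the lock, "CPU 2"
 * makes `attempts` tries while it is held, then CPU 1 unlocks and CPU 2
 * gets through. Returns how many of CPU 2's tries failed.
 */
int walk_diagram(int attempts)
{
    int failed = 0;
    bool ok;

    ok = spin_trylock(&mtx);            /* CPU 1: mtx_lock_spin(&mtx) */
    assert(ok);
    for (int i = 0; i < attempts; i++)  /* CPU 2: [stuck] */
        if (!spin_trylock(&mtx))
            failed++;
    spin_unlock(&mtx);                  /* CPU 1: mtx_unlock_spin(&mtx) */
    ok = spin_trylock(&mtx);            /* CPU 2: [unstuck] */
    assert(ok);
    spin_unlock(&mtx);                  /* CPU 2 done with its work */
    return failed;
}
```

Calling walk_diagram(3) returns 3: each of CPU 2's three attempts fails
while CPU 1 holds the flag, and the attempt after the unlock succeeds.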

Things are becoming clearer now. Thanks for your reply.
If I understand correctly, which kind of lock to use depends on which
interrupt-handling model (i.e., "interrupt filter" or "ithread") the
code runs under. Before taking a lock, I must know which model I am in;
otherwise the interrupt handling code might deadlock or panic the
kernel. The problem is, how can I tell which model I am in? I am not
yet clear on FreeBSD's threading model and scheduling code...

> This means it is unwise to hold spin locks for long periods.  In
> fact, if CPU 2 waits too long in that [stuck] section, it will
> panic, on the assumption that CPU 1 has done something terrible
> and the system is now hung.
>
> This is also what gives rise to the constraint that you must take
> MTX_SPIN locks "inside" any outer MTX_DEF locks.

What do you mean by "must take MTX_SPIN locks 'inside' any outer
MTX_DEF locks"?
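
One possible reading, sketched as a small userland model and only under
the assumption that it follows from the rule quoted earlier (no MTX_DEF
locks once interrupts are blocked): MTX_DEF first, MTX_SPIN second is
fine, while the reverse order is refused. The struct thread_ctx and the
model_* functions are hypothetical names for illustration; the real
kernel keeps this bookkeeping differently and should panic rather than
return false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread bookkeeping; not the kernel's real structure. */
struct thread_ctx {
    int spin_held;  /* MTX_SPIN mutexes held; interrupts blocked while > 0 */
    int def_held;   /* MTX_DEF mutexes held */
};

/*
 * Model of mtx_lock(): refused while any spin mutex is held, because a
 * contested MTX_DEF mutex may need to suspend the caller.
 */
bool model_mtx_lock(struct thread_ctx *td)
{
    if (td->spin_held > 0)
        return false;       /* the real kernel should panic here */
    td->def_held++;
    return true;
}

/* Model of mtx_lock_spin(): allowed whether or not MTX_DEF mutexes are held. */
bool model_mtx_lock_spin(struct thread_ctx *td)
{
    td->spin_held++;
    return true;
}

void model_mtx_unlock_spin(struct thread_ctx *td)
{
    assert(td->spin_held > 0);
    td->spin_held--;
}

void model_mtx_unlock(struct thread_ctx *td)
{
    assert(td->def_held > 0);
    td->def_held--;
}
```

In this model, model_mtx_lock() followed by model_mtx_lock_spin()
succeeds (the spin lock sits "inside" the default one), whereas calling
model_mtx_lock() while a spin lock is still held returns false.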

Regards,
Yubin Ruan