Date:      Wed, 12 Apr 2017 10:32:18 +0800
From:      Yubin Ruan <ablacktshirt@gmail.com>
To:        Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
Cc:        ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
Subject:   Re: Understanding the FreeBSD locking mechanism
Message-ID:  <99e3673e-d490-faef-359d-c6ec8a36ee0c@gmail.com>
In-Reply-To: <201704112311.v3BNB4fc094085@elf.torek.net>
References:  <201704112311.v3BNB4fc094085@elf.torek.net>

On 2017-04-12 07:11, Chris Torek wrote:
>> The difference between the "ithread" and "interrupt filter" things
>> is that ithread has its own thread context, while interrupt handling
>> through interrupt filter shares the same kernel stack.
>
> Right -- though rather than "the same" I would just say "shares
> a stack", i.e., we're not concerned with *whose* stack and/or
> thread we're borrowing, just that we have one borrowed.
>
>> So, for ithread, we should use MTX_DEF, which doesn't disable
>> interrupts, and for "interrupt filter", we should use MTX_SPIN, which
>> disables interrupts.
>
> Right.
>
>> What really confuses me is that I don't really see how owning an
>> "independent" thread context (i.e., ithread) makes a thread run in the
>> "top-half" and how sharing the same kernel stack makes a thread run in
>> the "bottom-half".
>
> It's not that it *makes* it run that way, it's that it *allows* it
> to run that way -- and then the scheduler *does* run it that way.
>
>> I did read your long explanation in the previous mail. For the non-SMP
>> case, the "top-half/bottom-half" model goes well and I understand how
>> the *code* path/*data* path things go. But I still cannot fully
>> understand the model for the SMP case.
>
> It's fundamentally fairly tricky, but we start with that same first
> notion:
>
>  * If you have your own state (i.e., stack), you can be suspended
>    (stopped in the scheduler, giving the CPU to other threads):
>    *your* (private) state is preserved on *your* (private) stack.
>
>  * If you have borrowed someone else's state, anything that suspends
>    you, suspends them too.  Since this may deadlock, you are not
>    allowed to do it at all.

Clear. How can I distinguish these two conditions? I mean, how can I
tell whether I am using my own state/stack or borrowing someone else's?

> Once we block interrupts locally (as for MTX_SPIN, or
> automatically inside a filter style or "bottom half" interrupt),
> we are in a special state: we may not take *any* MTX_DEF locks at
> all (the kernel should panic if we do).
>
> This in turn means that data structures are protected *either* by
> a spin mutex *or* by a default (non-spin) mutex, never both.  So
> if you need to touch a spin-mutex data structure from thread-y
> ("top half") code, you obtain the spin mutex, and now no interrupts
> will occur *on this CPU*, and as a key side effect, you won't move
> *off* this CPU either.  If an interrupt occurs on another CPU and
> it goes to take the spin lock that protects that CPU, it loops
> at that point, not switching tasks, waiting for the MTX_SPIN mutex
> to be released:
>
>        CPU 1                          CPU 2
>     ----------------------------|-----------------------------
>     func() {                    | ... code not involving mtx
>         mtx_lock_spin(&mtx);    | ...
>         do some work            |    mtx_lock_spin(&mtx); /* loops */
>              .                  |        [stuck]
>              .                  |        [stuck]
>              .                  |        [stuck]
>        mtx_unlock_spin(&mtx);   |        [unstuck]
>              ...                |        do some work
>
> If an interrupt occurs on CPU 2, and that interrupt-handling code
> wants to touch the data protected by the spin lock, that code
> obtains the spin lock as usual.  Meanwhile the interrupt *cannot*
> occur on CPU 1, as holding the spin lock has blocked interrupts.
> So the code path on CPU 2 blocks -- looping in mtx_lock_spin(),
> not giving CPU 2 over to the scheduler -- for as long as CPU 1
> holds the spin lock.  The corresponding code path is already
> blocked on CPU 1, the same way it was back in the non-SMP, single-
> CPU days.
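
To make the two-CPU picture above concrete, here is a minimal user-space
sketch, not kernel code. It assumes a C11 atomic_flag can stand in for an
MTX_SPIN mutex and that two pthreads can stand in for CPU 1 and CPU 2.
The names spin_lock(), spin_unlock(), cpu_work() and run_two_cpu_demo()
are made up for illustration and are not FreeBSD APIs. It also cannot
show the interrupt-blocking side of a real spin mutex, only the "loop
instead of sleeping" behaviour.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an MTX_SPIN mutex (user space only). */
static atomic_flag mtx_flag = ATOMIC_FLAG_INIT;
static long shared_counter;            /* data protected by mtx_flag */

static void spin_lock(void)
{
    /* Loop in place rather than sleep: the "[stuck]" rows in the diagram. */
    while (atomic_flag_test_and_set_explicit(&mtx_flag, memory_order_acquire))
        ;
}

static void spin_unlock(void)
{
    /* Release the lock: a waiter on the other "CPU" becomes "[unstuck]". */
    atomic_flag_clear_explicit(&mtx_flag, memory_order_release);
}

static void *cpu_work(void *arg)
{
    long iterations = *(long *)arg;

    for (long i = 0; i < iterations; i++) {
        spin_lock();
        shared_counter++;              /* "do some work", kept very short */
        spin_unlock();
    }
    return NULL;
}

/* Run two threads ("CPU 1" and "CPU 2"); return the final counter value. */
long run_two_cpu_demo(long iterations)
{
    pthread_t cpu1, cpu2;

    shared_counter = 0;
    pthread_create(&cpu1, NULL, cpu_work, &iterations);
    pthread_create(&cpu2, NULL, cpu_work, &iterations);
    pthread_join(cpu1, NULL);
    pthread_join(cpu2, NULL);
    return shared_counter;
}
```

Calling run_two_cpu_demo(n) from a small main() should return exactly
2 * n, because each increment happens while the other thread is either
elsewhere or spinning. The critical section is deliberately tiny, since
the waiter burns CPU for as long as the lock is held.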

Things are clearer now. Thanks for your reply.
If I understand correctly, which kind of lock I should use depends on
which interrupt handling model (i.e., "interrupt filter" or "ithread")
I am in. If I want to use a lock, I must know in advance which model I
am in, otherwise the interrupt handling code might cause a deadlock or
a kernel panic. The problem is, how can I tell which model I am in? I
am not so clear about the thread model and the scheduling code of
FreeBSD...

> This means it is unwise to hold spin locks for long periods.  In
> fact, if CPU 2 waits too long in that [stuck] section, it will
> panic, on the assumption that CPU 1 has done something terrible
> and the system is now hung.
>
> This is also what gives rise to the constraint that you must take
> MTX_SPIN locks "inside" any outer MTX_DEF locks.

What do you mean by "must take MTX_SPIN locks 'inside' any outer
MTX_DEF locks"?
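
One possible reading, which may well be wrong, is that this is a nesting
order: a sleepable lock is taken first and the spin lock is taken while
already holding it, never the reverse. Below is a user-space sketch of
that reading. It assumes a pthread mutex can stand in for MTX_DEF and a
C11 atomic_flag for MTX_SPIN. The names def_lock, spin_flag, thread_data,
intr_data and update_in_allowed_order() are hypothetical and are not
FreeBSD APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: a sleepable lock and a spinning lock. */
static pthread_mutex_t def_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ MTX_DEF  */
static atomic_flag spin_flag = ATOMIC_FLAG_INIT;             /* ~ MTX_SPIN */

static int thread_data;   /* protected by def_lock  */
static int intr_data;     /* protected by spin_flag */

/* Take the sleepable lock outside and the spin lock inside it. */
int update_in_allowed_order(void)
{
    int sum;

    pthread_mutex_lock(&def_lock);     /* outer: may sleep, so it goes first */
    thread_data++;

    while (atomic_flag_test_and_set_explicit(&spin_flag, memory_order_acquire))
        ;                              /* inner: spins, never sleeps */
    intr_data++;
    sum = thread_data + intr_data;     /* both locks held while reading */
    atomic_flag_clear_explicit(&spin_flag, memory_order_release);

    pthread_mutex_unlock(&def_lock);   /* release in reverse order */

    /*
     * The reverse nesting (calling pthread_mutex_lock() while holding
     * spin_flag) is what the rule would forbid under this reading: the
     * outer holder could go to sleep while others spin on the inner lock.
     */
    return sum;
}
```

Under this reading, the first call returns 2 and the second returns 4,
since each call bumps both counters once.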

Regards,
Yubin Ruan
