From owner-freebsd-arch@FreeBSD.ORG  Fri Sep 13 13:56:02 2013
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id A81F9DBB;
 Fri, 13 Sep 2013 13:56:02 +0000 (UTC)
 (envelope-from dkandula@gmail.com)
Received: from mail-wi0-x22a.google.com (mail-wi0-x22a.google.com
 [IPv6:2a00:1450:400c:c05::22a])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id DDB8A2467;
 Fri, 13 Sep 2013 13:56:01 +0000 (UTC)
Received: by mail-wi0-f170.google.com with SMTP id cb5so1047049wib.3
 for <multiple recipients>; Fri, 13 Sep 2013 06:56:00 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.180.13.174 with SMTP id i14mr2659371wic.49.1379080560197;
 Fri, 13 Sep 2013 06:56:00 -0700 (PDT)
Received: by 10.194.38.167 with HTTP; Fri, 13 Sep 2013 06:56:00 -0700 (PDT)
In-Reply-To: <52329012.2050408@freebsd.org>
References: <CA+qNgxSVkSi88UC3gmfwigmP0UCO6dz+_Zxhf_=URK7p4c-Ghg@mail.gmail.com>
 <CAFHCsPXJkxvJrhfbZt5T=Bm=ZS8-+E9xL1cY7b6UENHJ74YR5Q@mail.gmail.com>
 <CA+qNgxT68eobU+G4AjKeU6wZb0xM_sktDdQ=jCcmYyzQR+asiw@mail.gmail.com>
 <201309120824.52916.jhb@freebsd.org>
 <FAF0B30B-0F54-43F6-9239-AC0CC64AC955@mu.org>
 <CA+qNgxS3Sm+TWfEXGhr=9KxAgGtx4pp3deO=Wm=PeZMbgf9piw@mail.gmail.com>
 <52329012.2050408@freebsd.org>
Date: Fri, 13 Sep 2013 09:56:00 -0400
Message-ID: <CA+qNgxSFsVeeHk+SUPXRKRHYqJyq0T1K0ZB2ht7jAKc9VC9vVw@mail.gmail.com>
Subject: Re: Why do we need to acquire the current thread's lock before
 context switching?
From: Dheeraj Kandula <dkandula@gmail.com>
To: Julian Elischer <julian@freebsd.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 13 Sep 2013 13:56:02 -0000

Please find below the updated diff with the typo fixed.

# svn diff
Index: sys/sys/proc.h
===================================================================
--- sys/sys/proc.h (revision 255514)
+++ sys/sys/proc.h (working copy)
@@ -197,12 +197,44 @@
 };

 /*
+ * Comments by: Svatopluk Kraus & John Baldwin <jhb@freebsd.org>
+ *
+ * Svatopluk Kraus' comment:
+ * Think about td_lock like something what is lent by current thread owner. If
+ * a thread is running, it's owned by scheduler and td_lock points
+ * to scheduler lock. If a thread is sleeping, it's owned by sleeping queue
+ * and td_lock points to sleep queue lock. If a thread is contested, it's
+ * owned by turnstile queue and td_lock points to turnstile queue lock. And so
+ * on. This way an owner can work with owned threads safely without giant
+ * lock. The td_lock pointer is changed atomically, so it's safe.
+ *
+ * John Baldwin's comment:
+ * For example: take a thread that is asleep on a sleep
+ * queue.  td_lock points to the relevant SC_LOCK() for the sleep queue chain
+ * in that case, so any other thread that wants to examine that thread's
+ * state ends up locking the sleep queue while it examines that thread.  In
+ * particular, the thread that is doing a wakeup() can resume all of the
+ * sleeping threads for a wait channel by holding the one SC_LOCK() for that
+ * wait channel since that will be td_lock for all those threads.
+ *
+ * In general mutexes are only unlocked by the thread that locks them,
+ * and the td_lock of the old thread is unlocked during sched_switch().
+ * However, the old thread has to grab td_lock of the new thread during
+ * sched_switch() and then hand it off to the new thread when it resumes.
+ * This is why sched_throw() and sched_switch() in ULE directly assign
+ * 'mtx_lock' of the run queue lock before calling cpu_throw() or
+ * cpu_switch().  That gives the effect that the new thread resumes while
+ * holding the lock pointed to by its td_lock.
+ */
+/*
  * Kernel runnable context (thread).
  * This is what is put to sleep and reactivated.
  * Thread context.  Processes may have multiple threads.
  */
 struct thread {
- struct mtx *volatile td_lock; /* replaces sched lock */
+ struct mtx *volatile td_lock; /* replaces sched lock. Look at the comment
+    * above for further details.
+                                            */
  struct proc *td_proc; /* (*) Associated process. */
  TAILQ_ENTRY(thread) td_plist; /* (*) All threads in this proc. */
  TAILQ_ENTRY(thread) td_runq; /* (t) Run queue. */



On Fri, Sep 13, 2013 at 12:09 AM, Julian Elischer <julian@freebsd.org> wrote:

> On 9/13/13 4:44 AM, Dheeraj Kandula wrote:
>
>> # svn diff
>> Index: sys/sys/proc.h
>> ===================================================================
>> --- sys/sys/proc.h (revision 255488)
>> +++ sys/sys/proc.h (working copy)
>> @@ -197,12 +197,44 @@
>>   };
>>
>>   /*
>> + * Comments by: Svatopluk Kraus & John Baldwin <jhb@freebsd.org>
>> + *
>> + * Svatopluk Kraus' comment:
>> + * Think about td_lock like something what is lent by current thread owner. If
>> + * a thread is running, it's owned by scheduler and td_lock points
>> + * to scheduler lock. If a thread is sleeping, it's owned by sleeping queue
>> + * and td_lock points to sleep queue lock. If a thread is contested, it's
>> + * owned by turnstile queue and td_lock points to turnstile queue lock. And so
>> + * on. This way an owner can work with owned threads safely without giant
>> + * lock. The td_lock pointer is changed atomically, so it's safe.
>> + *
>> + * John Baldwin's comment:
>> + * For example: take a thread that is asleep on a sleep
>> + * queue.  td_lock points to the relevant SC_LOCK() for the sleep queue chain
>> + * in that case, so any other thread that wants to examine that thread's
>> + * state ends up locking the sleep queue while it examines that thread.  In
>> + * particular, the thread that is doing a wakeup() can resume all of the
>> + * sleeping threads for a wait channel by holding the one SC_LOCK() for that
>> + * wait channel since that will be td_lock for all those threads.
>> + *
>> + * In general mutexes are only unlocked by the thread that locks them,
>> + * and the td_lock of the old thread is unlocked during sched_switch().
>> + * However, the old thread has to grab td_lock of the new thread during
>> + * sched_switch() and then hand it off to the new thread when it resumes.
>> + * This is why sched_throw() and sched_switch() in ULE directly assign
>> + * 'mtx_lock' of the run queue lock before calling cpu_throw() or
>> + * cpu_switch().  That gives the effect that the new thread resumes while
>> + * holding the lock pinted to by its td_lock.
>> + */
>> +/*
>>    * Kernel runnable context (thread).
>>    * This is what is put to sleep and reactivated.
>>    * Thread context.  Processes may have multiple threads.
>>    */
>>   struct thread {
>> - struct mtx *volatile td_lock; /* replaces sched lock */
>> + struct mtx *volatile td_lock; /* replaces sched lock. Look at the comment
>> +    * above for further details.
>> +                                            */
>>    struct proc *td_proc; /* (*) Associated process. */
>>    TAILQ_ENTRY(thread) td_plist; /* (*) All threads in this proc. */
>>    TAILQ_ENTRY(thread) td_runq; /* (t) Run queue. */
>>
>>
>>
>> On Thu, Sep 12, 2013 at 4:21 PM, Alfred Perlstein <bright@mu.org> wrote:
>>
>>> Both these explanations are so great. Is there any way we can add this to
>>> proc.h or maybe document somewhere and then link to it from proc.h?
>>>
>>> Sent from my iPhone
>>>
>>> On Sep 12, 2013, at 5:24 AM, John Baldwin <jhb@freebsd.org> wrote:
>>>
>>>> On Thursday, September 12, 2013 7:16:20 am Dheeraj Kandula wrote:
>>>>> Thanks a lot Svatopluk for the clarification. Right after I replied to
>>>>> Alfred's mail, I realized that it can't be thread specific lock as it
>>>>> should also protect the scheduler variables. So if I understand it right,
>>>>> even though it is a mutex, it can be unlocked by another thread which is
>>>>> usually not the case with regular mutexes as the thread that locks it must
>>>>> unlock it unlike a binary semaphore. Isn't it?
>>>>
>>>> It's less complicated than that. :)  It is a mutex, but to expand on what
>>>> Svatopluk said with an example: take a thread that is asleep on a sleep
>>>> queue.  td_lock points to the relevant SC_LOCK() for the sleep queue chain
>>>> in that case, so any other thread that wants to examine that thread's
>>>> state ends up locking the sleep queue while it examines that thread.  In
>>>> particular, the thread that is doing a wakeup() can resume all of the
>>>> sleeping threads for a wait channel by holding the one SC_LOCK() for that
>>>> wait channel since that will be td_lock for all those threads.
>>>>
>>>> In general mutexes are only unlocked by the thread that locks them,
>>>> and the td_lock of the old thread is unlocked during sched_switch().
>>>> However, the old thread has to grab td_lock of the new thread during
>>>> sched_switch() and then hand it off to the new thread when it resumes.
>>>> This is why sched_throw() and sched_switch() in ULE directly assign
>>>> 'mtx_lock' of the run queue lock before calling cpu_throw() or
>>>> cpu_switch().  That gives the effect that the new thread resumes while
>>>> holding the lock pinted to by its td_lock.
>>>>
>>>                                    ^^ typo.. fix before commit
>
>
>>>>> Dheeraj
>>>>>
>>>>>
>>>>> On Thu, Sep 12, 2013 at 7:04 AM, Svatopluk Kraus <onwahe@gmail.com> wrote:
>>>>>
>>>>>> Think about td_lock like something what is lent by current thread owner.
>>>>>> If a thread is running, it's owned by scheduler and td_lock points
>>>>>> to scheduler lock. If a thread is sleeping, it's owned by sleeping queue
>>>>>> and td_lock points to sleep queue lock. If a thread is contested, it's
>>>>>> owned by turnstile queue and td_lock points to turnstile queue lock. And so
>>>>>> on. This way an owner can work with owned threads safely without giant
>>>>>> lock. The td_lock pointer is changed atomically, so it's safe.
>>>>>>
>>>>>> Svatopluk Kraus
>>>>>>
>>>>>> On Thu, Sep 12, 2013 at 12:48 PM, Dheeraj Kandula <dkandula@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks a lot Alfred for the clarification. So is the td_lock granular i.e.
>>>>>>> one separate lock for each thread but also used for protecting the
>>>>>>> scheduler variables or is it just one lock used by all threads and the
>>>>>>> scheduler as well. I will anyway go through the code that you suggested but
>>>>>>> just wanted to have a deeper understanding before I go about hunting in the
>>>>>>> code.
>>>>>>>
>>>>>>> Dheeraj
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 12, 2013 at 3:10 AM, Alfred Perlstein <bright@mu.org> wrote:
>>>>>>>
>>>>>>>> On 9/11/13 2:39 PM, Dheeraj Kandula wrote:
>>>>>>>>
>>>>>>>>> Hey All,
>>>>>>>>>
>>>>>>>>> When the current thread is being context switched with a newly selected
>>>>>>>>> thread, why is the current thread's lock acquired before context switch –
>>>>>>>>> mi_switch() is invoked after thread_lock(td) is called. A thread at any
>>>>>>>>> time runs only on one of the cores of a CPU. Hence when it is being context
>>>>>>>>> switched it is added either to the real time runq or the timeshare runq or
>>>>>>>>> the idle runq with the lock still held or it is added to the sleep queue or
>>>>>>>>> the blocked queue. So this happens atomically even without the lock. Isn't
>>>>>>>>> it? Am I missing something here? I don't see any contention for the thread
>>>>>>>>> in order to demand a lock for the thread which will basically protect the
>>>>>>>>> contents of the thread structure for the thread.
>>>>>>>>>
>>>>>>>>> Dheeraj
>>>>>>>>>
>>>>>>>> The thread lock also happens to protect various scheduler variables:
>>>>>>>>
>>>>>>>>         struct mtx      *volatile td_lock; /* replaces sched lock */
>>>>>>>>
>>>>>>>> see sys/kern/sched_ule.c on how the thread lock td_lock is changed
>>>>>>>> depending on what the thread is doing.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alfred Perlstein
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> freebsd-arch@freebsd.org mailing list
>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>>>>>>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>>>
>>>>>
>>>> --
>>>> John Baldwin
>>>>
>>
>>
>>
>>
>