From owner-freebsd-arch@FreeBSD.ORG  Thu Sep 12 21:29:05 2013
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id 7F09912D;
 Thu, 12 Sep 2013 21:29:05 +0000 (UTC)
 (envelope-from dkandula@gmail.com)
Received: from mail-we0-x22a.google.com (mail-we0-x22a.google.com
 [IPv6:2a00:1450:400c:c03::22a])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id DA5A928B4;
 Thu, 12 Sep 2013 21:29:04 +0000 (UTC)
Received: by mail-we0-f170.google.com with SMTP id w62so374638wes.15
 for <multiple recipients>; Thu, 12 Sep 2013 14:29:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=+DGrusQIXx8Lk8PudgSXWs6/GOtCb7M4dFuUeS9PKhw=;
 b=zulVg3s153eCf0OazioARZyzdvOwb8R495oBo0FijRlOtzfrB4JtXCGNP2gj/UowPr
 yPBZOHG982vDahFBeKoLlkUGhDx00YPPuuEmsFapw93hVjtfilDAHwexyHisEm3nXfwh
 kS0IEIxOsTBNn37Ek+8X+DnGNPWq08NYrKLTK1+CTsofaAVhPF7Qg9qCPxSmj0f2k5+i
 yQTNYab7a8VVsxd50mMJK+8fPzMLddmbR49mVzU01DwHGOS7UQst477b3BKokfEyvUCL
 905aySae4nZ1KGusuow805rovSi48DPr2uXfugFqsBP8jEqF2NntnxmnoHEhfUi/y5yw
 TV2Q==
MIME-Version: 1.0
X-Received: by 10.180.211.111 with SMTP id nb15mr7474838wic.55.1379021343262; 
 Thu, 12 Sep 2013 14:29:03 -0700 (PDT)
Received: by 10.194.38.167 with HTTP; Thu, 12 Sep 2013 14:29:03 -0700 (PDT)
In-Reply-To: <201309121710.01307.jhb@freebsd.org>
References: <CA+qNgxSVkSi88UC3gmfwigmP0UCO6dz+_Zxhf_=URK7p4c-Ghg@mail.gmail.com>
 <CA+qNgxSmTk8S95+DL5BQ8UFZWorNV0YwP9hiikRbDOJrFJp-7A@mail.gmail.com>
 <CA+qNgxQ3RZz+nYV97ABMQA7Uu97Nm4nW5ECTh5_CSkWtZSZ_4g@mail.gmail.com>
 <201309121710.01307.jhb@freebsd.org>
Date: Thu, 12 Sep 2013 17:29:03 -0400
Message-ID: <CA+qNgxTVN-zH1m7uZErD7Pj6m9m2nYQXxMmY-eSKCKj9JmxPpA@mail.gmail.com>
Subject: Re: Why do we need to acquire the current thread's lock before
 context switching?
From: Dheeraj Kandula <dkandula@gmail.com>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Alfred Perlstein <bright@mu.org>, Svatopluk Kraus <onwahe@gmail.com>,
 "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 12 Sep 2013 21:29:05 -0000

Thanks John for the excellent explanation. That helps a lot.

Dheeraj


On Thu, Sep 12, 2013 at 5:10 PM, John Baldwin <jhb@freebsd.org> wrote:

> On Thursday, September 12, 2013 4:00:56 pm Dheeraj Kandula wrote:
> > Hey John,
> >        I think I get it now clearly.
> >
> > The td_lock of each thread actually points to the Thread Queue's lock on
> > which it is present. i.e. run queue which may either be the real time
> runq,
> > timeshare runq or the idle runq. For sleep the td_lock points to the
> > blocked_lock which is a global lock protecting the sleep queue I think.
> >
> > Before cpu_switch() is invoked, the old thread's td_lock is released as
> > shown below: the code is from sched_switch of sched_ule.c
> >
> >  lock_profile_release_lock<
> http://nxr.netbsd.org/source/s?defs=lock_profile_release_lock&project=src-freebsd
> >
> > (&TDQ_LOCKPTR<
> http://nxr.netbsd.org/source/xref/src-freebsd/sys/kern/sched_ule.c#TDQ_LOCKPTR
> >
> > (tdq)->lock_object<
> http://nxr.netbsd.org/source/s?defs=lock_object&project=src-freebsd>
> > );
> >
> > TDQ_LOCKPTR <
> http://nxr.netbsd.org/source/xref/src-freebsd/sys/kern/sched_ule.c#TDQ_LOCKPTR
> >(tdq)->mtx_lock
> > <http://nxr.netbsd.org/source/s?defs=mtx_lock&project=src-freebsd> =
> > (uintptr_t <
> http://nxr.netbsd.org/source/s?defs=uintptr_t&project=src-freebsd>)newtd
> > <
> http://nxr.netbsd.org/source/xref/src-freebsd/sys/kern/sched_ule.c#newtd>;
>
> This is not releasing the td_lock of the old thread.  TDQ_LOCK() is
> td_lock of the new thread that is about to run, and the assignment
> to mtx_lock is transferring ownership of TDQ_LOCK() from the old thread
> to the new thread.  The lock_profile calls fake an unlock / lock so the
> profiling code doesn't get confused by the handoff, but TDQ_LOCK() is
> not unlocked here.
>
> The old thread's td_lock is left alone if it is already TDQ_LOCK()
> (notice that in this snippet, the various branches assert this to be true):
>
>         /*
>          * The lock pointer in an idle thread should never change.  Reset
> it
>          * to CAN_RUN as well.
>          */
>         if (TD_IS_IDLETHREAD(td)) {
>                 MPASS(td->td_lock == TDQ_LOCKPTR(tdq));
>                 TD_SET_CAN_RUN(td);
>         } else if (TD_IS_RUNNING(td)) {
>                 MPASS(td->td_lock == TDQ_LOCKPTR(tdq));
>                 ...
>
> However, in the other cases, this code is called:
>
>         } else {
>                 /* This thread must be going to sleep. */
>                 TDQ_LOCK(tdq);
>                 mtx = thread_lock_block(td);
>                 tdq_load_rem(tdq, td);
>         }
>
> thread_lock_block() changes td_lock in the old thread to point to a
> dummy lock called the "block_lock" and then unlocks the old thread's
> td_lock.  However, it returns the previous value of td_lock as 'mtx'.
> That is then passed to cpu_switch().  cpu_switch() restores td_lock
> in the old thread to 'mtx'.
>
> The effect of this is to temporarily assign td_lock of the old thread
> to block_lock while the thread finishes switching out, but to be able
> to release the old thread's associated td_lock in C rather than having
> to do it from cpu_switch().  The 'block_lock' is a special lock that
> is always locked and never unlocked.  That causes any other threads
> that are trying to lock the old thread to spin until the old thread
> is finished switching out even after the old thread's td_lock has
> been released (since the new thread will spin on block_lock until
> cpu_switch() restores td_lock).
>
> > Later after cpu_switch is done,
> >
> >
> > lock_profile_obtain_lock_success
> > <
> http://nxr.netbsd.org/source/s?defs=lock_profile_obtain_lock_success&project=src-freebsd
> >(&TDQ_LOCKPTR
> > <
> http://nxr.netbsd.org/source/xref/src-freebsd/sys/kern/sched_ule.c#TDQ_LOCKPTR
> >(tdq)->lock_object
> > <http://nxr.netbsd.org/source/s?defs=lock_object&project=src-freebsd>,
> > 0, 0, __FILE__ <
> http://nxr.netbsd.org/source/s?defs=__FILE__&project=src-freebsd>,
> > __LINE__ <
> http://nxr.netbsd.org/source/s?defs=__LINE__&project=src-freebsd>);
> >
> >
> > is executed which locks the lock of the thread queue on the current
> > CPU which can be on a different CPU. I assume the new thread's td_lock
> > points to the current CPU's thread queue.
>
> As with the first hunk, this is not actually locking the lock, just
> pacifying the profiling code.  The new thread's td_lock does indeed
> point to the current CPU's thread queue.  This is known to be true
> because the new thread was chosen from the current CPU's thread
> queue.
>
> > Now it is clear that the mutex is unlocked by the same thread that locks
> it.
>
> Except not in this one case. :)  In the case of a context switch, the
> old thread locks TDQ_LOCK(), sets td_lock to blocked_lock and drops
> its old td_lock, and changes the internals of TDQ_LOCK() so that it is
> now owned by the new thread.  cpu_switch() is called which then restores
> td_lock in the old thread, switches the MD context (stack and registers,
> etc.).  The new thread returns on its own stack, and it now owns the
> TDQ_LOCK() that was locked by the old thread.  However, since the
> ownership was transferred, it can unlock TDQ_LOCK().
>
> Note that if a running thread is preempted, it doesn't do the block_lock
> business, instead it just transfers ownership of the TDQ_LOCK() it
> already holds to the new thread and never explicitly unlocks that lock.
>
> --
> John Baldwin
>