Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Oct 2012 16:56:59 +0000
From:      Attilio Rao <attilio@freebsd.org>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-hackers <freebsd-hackers@freebsd.org>, Jeff Roberson <jeff@freebsd.org>
Subject:   Re: ule+smp: small optimization for turnstile priority lending
Message-ID:  <CAJ-FndAzuCBCuZurMehDu=OYRCKJ=0RJ8-VcjUE5QRJpxdyWzw@mail.gmail.com>
In-Reply-To: <506C461F.2050008@FreeBSD.org>
References:  <50587F8D.9060102@FreeBSD.org> <CAJ-FndCWzTBRYsA0mFDCj8RU06ZUTi3G0LeEFcS9_c5zKd%2BgWQ@mail.gmail.com> <5058C68B.1010508@FreeBSD.org> <CAJ-FndC8j12a95N0QprYTLJLC06R8jjcaHuxEZKu5D9RHW=ZVw@mail.gmail.com> <50596019.5060708@FreeBSD.org> <CAJ-FndBxWYOkRCn-DZTdZ%2BB4RpvsvaxtwDyMc8M7YhRj9DaL2w@mail.gmail.com> <505AF836.7050004@FreeBSD.org> <CAJ-FndAiiH5ToUtns=jC7_V27BkeOvJ8_T9SOmw2DGdFyfvTdg@mail.gmail.com> <506C461F.2050008@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Oct 3, 2012 at 3:05 PM, Andriy Gapon <avg@freebsd.org> wrote:
> on 20/09/2012 16:14 Attilio Rao said the following:
>> On 9/20/12, Andriy Gapon <avg@freebsd.org> wrote:
> [snip]
>>> The patch works well as far as I can tell.  Thank you!
>>> There is one warning with full witness enables but it appears to be harmless
>>> (so
>>> far):
>>
>> Andriy,
>> thanks a lot for your testing and reports you made so far.
>> Unfortunately I'm going off for 2 weeks now and I won't work on
>> FreeBSD for that timeframe. I will get back to those in 2 weeks then.
>> If you want  to play more with this idea feel free to extend/fix/etc.
>> this patch.
>
> Unfortunately I haven't found time to work on this further, but I have some
> additional thoughts.
>
> First, I think that the witness message was not benign and it actually warns about
> the same kind of deadlock that I originally had.
> The problem is that sched_lend_prio() is called with target thread's td_lock held,
> which is a lock of tdq on the thread's CPU.  Then, in your patch, we acquire
> current tdq's lock to modify its load.  But if two tdq locks are to be held at the
> same time, then they must be locked using tdq_lock_pair, so that lock order is
> maintained.  With the patch no tdq lock order can be maintained (can arbitrary)
> and thus it is possible to run into a deadlock.

Indeed. I realized this after thinking about this problem while I was
on holiday.

>
> I see two possibilities here, but don't rule out that there can be more.
>
> 1. Do something like my patch does.  That is, manipulate current thread's tdq in
> advance before any other thread/tdq locks are acquired (or td_lock is changed to
> point to a different lock and current tdq is unlocked).  The API can be made more
> generic in nature.  E.g. it can look like this:
> void
> sched_thread_block(struct thread *td, int inhibitor)
> {
>         struct tdq *tdq;
>
>         THREAD_LOCK_ASSERT(td, MA_OWNED);
>         KASSERT(td == curthread,
>             ("sched_thread_block: only curthread is supported"));
>         tdq = TDQ_SELF();
>         TDQ_LOCK_ASSERT(tdq, MA_OWNED);
>         MPASS(td->td_lock == TDQ_LOCKPTR(tdq));
>         TD_SET_INHIB(td, inhibitor);
>         tdq_load_rem(tdq, td);
>         tdq_setlowpri(tdq, td);
> }
>
>
> 2. Try to do things from sched_lend_prio based on curthread's state.  This, as it
> seems, would require completely lock-less manipulations of current tdq.  E.g. for
> tdq_load we could use atomic operations (it is already accessed locklessly, but
> not modified).  But for tdq_lowpri a more elaborate trick would be required like
> having a separate field for a temporary value.

No, this is not going to work because tdq_lowpri and tdq_load are
updated and used in the same critical path (and possibly also other
tdq* members in tdq_runqueue_add(), for example, but I didn't bother
to also check that).

Let me think some more about this and get back to you.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndAzuCBCuZurMehDu=OYRCKJ=0RJ8-VcjUE5QRJpxdyWzw>