From owner-freebsd-current@FreeBSD.ORG  Tue May 17 19:16:52 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Message-ID: <4DD2C99D.50203@love2party.net>
Date: Tue, 17 May 2011 12:16:45 -0700
From: Max Laier <max@love2party.net>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
	rv:1.9.2.15) Gecko/20110419 Lightning/1.0b2pre Thunderbird/3.1.9
MIME-Version: 1.0
To: John Baldwin <jhb@freebsd.org>
References: <4DCD357D.6000109@FreeBSD.org> <4DD26720.3000001@FreeBSD.org>
	<4DD2A058.6050400@love2party.net>
	<201105171256.41091.jhb@freebsd.org>
In-Reply-To: <201105171256.41091.jhb@freebsd.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: neel@freebsd.org, Andriy Gapon <avg@freebsd.org>,
	Attilio Rao <attilio@freebsd.org>,
	FreeBSD current <freebsd-current@freebsd.org>,
	Stephan Uphoff <ups@freebsd.org>, Peter Grehan <grehan@freebsd.org>
Subject: Re: proposed smp_rendezvous change
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>

On 05/17/2011 09:56 AM, John Baldwin wrote:
> On Tuesday, May 17, 2011 12:20:40 pm Max Laier wrote:
>> On 05/17/2011 05:16 AM, John Baldwin wrote:
>> ...
>>> Index: kern/kern_switch.c
>>> ===================================================================
>>> --- kern/kern_switch.c (revision 221536)
>>> +++ kern/kern_switch.c (working copy)
>>> @@ -192,15 +192,22 @@
>>> critical_exit(void)
>>> {
>>> struct thread *td;
>>> - int flags;
>>> + int flags, owepreempt;
>>>
>>> td = curthread;
>>> KASSERT(td->td_critnest != 0,
>>> ("critical_exit: td_critnest == 0"));
>>>
>>> if (td->td_critnest == 1) {
>>> + owepreempt = td->td_owepreempt;
>>> + td->td_owepreempt = 0;
>>> + /*
>>> + * XXX: Should move compiler_memory_barrier() from
>>> + * rmlock to a header.
>>> + */
>>
>> XXX: If we get an interrupt at this point and td_owepreempt was zero,
>> the new interrupt will re-set it, because td_critnest is still non-zero.
>>
>> So we still end up with a thread that is leaking an owepreempt *and*
>> lose a preemption.
>
> I don't see how this can still leak owepreempt.  The nested interrupt should
> do nothing (except for possibly set owepreempt) until td_critnest is 0.

Exactly.  The interrupt sets owepreempt, and once it returns to this
point we set td_critnest to 0 and exit without clearing owepreempt.
Hence we leak the owepreempt.
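
To make the sequence concrete, here is a small userland model of the
window, not kernel code.  The names sim_thread, sim_interrupt and
sim_critical_exit are made up for illustration, and the part of the
patched critical_exit() after the saved owepreempt (set td_critnest to
0, then test the saved copy) is my assumption of how the rest of the
patch reads.  The "interrupt" is simply called at the point in question.

```c
/* Hypothetical stand-in for the two relevant struct thread fields. */
struct sim_thread {
	unsigned int td_critnest;
	int td_owepreempt;
};

/*
 * Models an interrupt that wants to preempt.  Inside a critical
 * section it can only record that a preemption is owed.
 */
static void
sim_interrupt(struct sim_thread *td)
{
	if (td->td_critnest != 0)
		td->td_owepreempt = 1;
}

/*
 * Models the patched critical_exit().  Returns 1 if it would switch.
 * interrupt_in_window != 0 fires the interrupt after owepreempt has
 * been saved and cleared but before td_critnest drops to 0.
 */
static int
sim_critical_exit(struct sim_thread *td, int interrupt_in_window)
{
	int owepreempt, switched = 0;

	if (td->td_critnest == 1) {
		owepreempt = td->td_owepreempt;
		td->td_owepreempt = 0;
		if (interrupt_in_window)
			sim_interrupt(td);	/* td_critnest is still 1 */
		td->td_critnest = 0;		/* assumed next step */
		if (owepreempt)
			switched = 1;
	} else
		td->td_critnest--;
	return (switched);
}
```

Starting from td_critnest == 1 and td_owepreempt == 0, a call with the
interrupt in the window returns 0 (no switch) and leaves td_critnest ==
0 with td_owepreempt == 1: the preemption is lost and the flag leaks.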

> However, we can certainly lose preemptions.
>
> I wonder if we can abuse the high bit of td_critnest for the owepreempt flag
> so it is all stored in one cookie.  We only set owepreempt while holding
> thread_lock() (so interrupts are disabled), so I think we would be ok and not
> need atomic ops.
>
> Hmm, actually, the top-half code would have to use atomic ops.  Nuts.  Let me
> think some more.

I think these two really belong in a single variable.  Setting the
owepreempt flag can be a normal RMW.  Increasing and decreasing critnest
must be atomic (otherwise we could lose the flag), and dropping the
final reference would work like this:

   if ((curthread->td_critnest & TD_CRITNEST_MASK) == 1) {
     unsigned int owe;
     /* Fetch and clear the count and the flag in one atomic step. */
     owe = atomic_readandclear_int(&curthread->td_critnest);
     if (owe & TD_OWEPREEMPT_FLAG) {
       /* do the switch */
     }
   }

That should do it ... I can put that into a patch, if we agree that's 
the right thing to do.
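
For discussion, a userland sketch of the whole scheme using C11
<stdatomic.h> rather than the kernel's atomic(9).  Everything here is
an assumption for illustration: the packed_* names, the bit layout
(high bit for the flag, low 31 bits for the nesting count), and the use
of an atomic OR to set the flag, which is stronger than the plain RMW
that would suffice in the kernel.

```c
#include <stdatomic.h>

/* Assumed layout: high bit is the flag, the rest is the count. */
#define	TD_OWEPREEMPT_FLAG	0x80000000u
#define	TD_CRITNEST_MASK	0x7fffffffu

/* Hypothetical stand-in for the combined struct thread field. */
struct packed_thread {
	atomic_uint td_critnest;
};

static void
packed_critical_enter(struct packed_thread *td)
{
	/* Atomic so a concurrent flag update is never overwritten. */
	atomic_fetch_add(&td->td_critnest, 1);
}

/* Interrupt side: record that a preemption is owed. */
static void
packed_set_owepreempt(struct packed_thread *td)
{
	atomic_fetch_or(&td->td_critnest, TD_OWEPREEMPT_FLAG);
}

/* Returns 1 if the caller has to do the switch. */
static int
packed_critical_exit(struct packed_thread *td)
{
	unsigned int owe;

	if ((atomic_load(&td->td_critnest) & TD_CRITNEST_MASK) == 1) {
		/* Count and flag are read and cleared together. */
		owe = atomic_exchange(&td->td_critnest, 0);
		return ((owe & TD_OWEPREEMPT_FLAG) != 0);
	}
	atomic_fetch_sub(&td->td_critnest, 1);
	return (0);
}
```

Because the flag travels with the count, a flag set at any time before
the final exchange is seen by that exchange, and a flag set afterwards
finds the count at zero, so there is no window left in which it can be
recorded and then forgotten.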

Thanks,
   Max