FreeBSD Mail Archives

Date:      Wed, 01 Nov 2000 10:12:33 -0800 (PST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        drony@spray.se, freebsd-smp@FreeBSD.org
Subject:   Re: HLT
Message-ID:  <XFMail.001101101233.jhb@FreeBSD.org>
In-Reply-To: <200011011052.DAA00688@usr02.primenet.com>

On 01-Nov-00 Terry Lambert wrote:
>> Err, no.  This has nothing to do with Giant, zero, zilch, nada.  Please do
>> go read the code before making such claims.  It has everything to do with
>> making sure that you don't put a CPU to sleep when there is work for it to
>> do, and to make sure you don't increase interrupt latency by having to
>> unnecessarily use an IPI to wake up a sleeping CPU so it can run a thread
>> that suddenly appears.  One can try hacking the code to enable hlt if one
>> really wants it (just change the #ifndef SMP to #ifndef SMP_XXX) but it will
>> have a negative effect on interrupt responsiveness when at least one CPU is
>> idle.
> 
> I thought the issue was a missing IPI in the scheduler code, when
> something became ready to run, where you could have a halted CPU
> waiting for work, and an active CPU that originated the work via
> handling an interrupt, and failed to IPI the halted CPU like it
> should to tell it that there were ready-to-run processes?

Yes.  In the world of interrupt threads that we now have, delaying
running the thread can delay running an interrupt handler, thus
increasing interrupt latency.  Granted, I'm more familiar with how we
do this in SMPng at the moment, and that is the position I am talking
from.  We still need some sort of IPI to shoot a sleeping CPU to wake it up
when there is work to do.

> It seems to me that we are not talking about increased interrupt
> latency, unless the halted CPU holds the BGL.
> 
> I was under the impression that the halted CPU would wake up
> from the I/O APIC IPI, should an interrupt occur, since that's
> sent to all CPUs, and the system currently operates in virtual
> wire mode.

Hmm.  I can't find in the code where we set up virtual wire mode, although I'm
not an APIC whiz, but I do know that our interrupt handling code assumes
that only 1 CPU will get a given interrupt, but that interrupts may be handed
out in a round-robin sort of fashion to different CPUs by the APIC.  I have
seen this happen in my kernel tracing during the SMPng stuff, and from the CPUs'
perspective, only one gets an I/O interrupt.

> Please help me understand why there is interrupt latency, if not
> for the BGL being held during HLT on idle?

In -current we don't hold Giant during idle.  In -stable I believe you are
correct though.

> I think the problem is that there is no IPI when the task is
> put on the scheduler ready-to-run; I guess I just always sort of
> assumed that the reason for that was the BGL, not lack of an
> IPI instruction.

In -current the BGL is no longer a problem during the idle loop, so it might be
possible to select a sleeping CPU and shoot it with a wakeup in setrunqueue().

> I still think the fix for this is to make the idle task a real
> task, with a HLT.  Then you differentiate the APIC IPI (virtual
> wire, interrupts to handle) from the scheduler IPI (wake up, you
> have work to do).

In -current we have per-CPU idle processes already. :)

> This really gets down to wanting per CPU scheduler queues, so
> that the scheduler IPI can be targeted, instead of broadcast.

Perhaps.  As long as you do your checking in setrunqueue() to wake a
sleeping CPU while you are holding sched_lock (again, in the context of
-current), you can safely just use a bitmask of currently HLT'd CPUs and
pick one to shoot right then and there.  The sched_lock would keep two CPUs
from trying to shoot the same CPU.
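
The bitmask idea above can be sketched roughly as follows.  This is a
userland model only, not kernel code: a pthread mutex stands in for
sched_lock, and every name in it (sched_lock_model, halted_cpus,
cpu_mark_halted, pick_halted_cpu_locked, setrunqueue_model) is hypothetical.
The actual IPI is not modeled; the function just returns the CPU that would
be shot.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for sched_lock; a real kernel would use its own lock. */
static pthread_mutex_t sched_lock_model = PTHREAD_MUTEX_INITIALIZER;

/* Bit n set means CPU n is currently halted.  Protected by sched_lock_model. */
static uint32_t halted_cpus;

/* Called by CPU `cpu` just before it would execute HLT. */
void cpu_mark_halted(int cpu)
{
    assert(cpu >= 0 && cpu < 32);
    pthread_mutex_lock(&sched_lock_model);
    halted_cpus |= (uint32_t)1 << cpu;
    pthread_mutex_unlock(&sched_lock_model);
}

/*
 * Must be called with sched_lock_model held.  Picks the lowest-numbered
 * halted CPU, clears its bit so no other caller can pick it again, and
 * returns its number, or -1 if no CPU is halted.
 */
int pick_halted_cpu_locked(void)
{
    if (halted_cpus == 0)
        return -1;
    int cpu = 0;
    while (((halted_cpus >> cpu) & 1u) == 0)
        cpu++;
    halted_cpus &= ~((uint32_t)1 << cpu);
    return cpu;
}

/*
 * Model of the check in setrunqueue(): while the lock is held for the
 * run queue update, also choose a halted CPU to wake.
 */
int setrunqueue_model(void)
{
    pthread_mutex_lock(&sched_lock_model);
    /* ... the real function would put the process on the run queue here ... */
    int target = pick_halted_cpu_locked();
    pthread_mutex_unlock(&sched_lock_model);
    return target;  /* the caller would send the wakeup IPI if target >= 0 */
}
```

Because the bit is cleared under the same lock that selects it, two callers
racing through setrunqueue_model() can never be handed the same target CPU.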

> If the CPU never mixes interrupt processing with scheduler
> checking (which it wouldn't have to, without a global scheduler
> queue), then you could always know that another CPU was idle,
> without having to lock, since it would have zero runnable tasks
> on its task list (idle would always be runnable, so if it isn't
> there, it must be running, and therefore the CPU must be halted,
> so doing a read is safe; even if you have a miss, it's not fatal,
> the CPU merely gets [non-broadcast] IPI'ed twice).

We already don't put idle on the runqueue but just default to it in
chooseproc() if nothing is runnable.  One thing to note is that we already have
the scheduler lock in setrunqueue() to protect the runqueue as well as p_stat,
so we wouldn't be adding any extra locking to simply support a bitmask of
sleeping CPUs.  Once we can get a kernel that actually allows more than one
CPU in it to do actual work, then time can be spent on optimizations such as
this.
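
To illustrate the "default to idle" behavior described above, here is a
minimal userland sketch.  It assumes a single global run queue kept as a
singly linked list and one idle process per CPU that is never placed on that
queue.  All names (proc_model, runq_head_model, idle_procs_model,
runq_add_model, chooseproc_model) are hypothetical, and locking is omitted;
the real code would do this under sched_lock.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU_MODEL 2  /* assumed CPU count for the sketch */

struct proc_model {
    const char *name;
    struct proc_model *next;
};

/* Global run queue; the idle processes are never linked onto it. */
static struct proc_model *runq_head_model;

/* One idle process per CPU. */
static struct proc_model idle_procs_model[NCPU_MODEL] = {
    { "idle0", NULL },
    { "idle1", NULL },
};

/* Append a runnable process to the tail of the run queue. */
void runq_add_model(struct proc_model *p)
{
    struct proc_model **pp = &runq_head_model;
    while (*pp != NULL)
        pp = &(*pp)->next;
    p->next = NULL;
    *pp = p;
}

/*
 * Return the next process for `cpu` to run: the head of the run queue
 * if there is one, otherwise that CPU's idle process.
 */
struct proc_model *chooseproc_model(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU_MODEL);
    struct proc_model *p = runq_head_model;
    if (p == NULL)
        return &idle_procs_model[cpu];
    runq_head_model = p->next;
    p->next = NULL;
    return p;
}
```

In this model an empty run queue is the only signal that a CPU is about to
idle, which is the point at which a halted-CPU bit would be set.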

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?XFMail.001101101233.jhb>
