Date: Mon, 24 Jul 2000 12:45:20 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Dillon <dillon@apollo.backplane.com>, Chuck Paterson <cp@bsdi.com>, Bruce Evans <bde@zeta.org.au>
Cc: David Greenman <dg@root.com>, freebsd-smp@FreeBSD.ORG
Subject: ipending (was: SMP progress (was: Stepping on Toes))
Message-ID: <20000724124520.F82241@wantadilla.lemis.com>
In-Reply-To: <200007221620.JAA29862@apollo.backplane.com>; from dillon@apollo.backplane.com on Sat, Jul 22, 2000 at 09:20:23AM -0700
References: <200007221620.JAA29862@apollo.backplane.com> <Pine.BSF.4.21.0007230257270.810-100000@besplex.bde.org> <200007221657.KAA20309@berserker.bsdi.com> <200007051652.KAA14768@berserker.bsdi.com> <20000722185705.A10221@wantadilla.lemis.com> <200007221620.JAA29862@apollo.backplane.com>
On Sunday, 23 July 2000 at 3:36:39 +1000, Bruce Evans wrote:
> On Sat, 22 Jul 2000, Matthew Dillon wrote:
>
>>> In a similar way, I'm removing interrupt mask copies in memory. We
>>> still mask interrupts which aren't in use, but no others. If anybody
>>> has any reason not to want to do this, we should talk about it.
>
> That would be wrong. The interrupt masks are kept in memory because it
> is much faster to access them there. Several hundred times faster than
> PIC accesses on current machines. When I implemented this on 1992's
> machines, the memory copies were less than ten times faster. The APIC
> case is not as bad. I don't know the details of it.
It wasn't my intention to retrieve them from the PIC/APIC. Certainly
if we have to set interrupt masks, we'd keep copies of them in memory.
More on this below.
>> I think you still have to mask level interrupts, otherwise you won't
>> be able to sti. Some subsystems may generate a phenomenal number
>> of interrupts while the interrupt routine is running -- for example,
> ^^^ another
>> the serial ports.
>
> E.g., one serial port interrupting every 87 usec gives about 50
> interrupts while the keyboard interrupt handler is busy-waiting to
> program the keyboard LEDs.
Ugh. Are keyboards that slow? But is that such a big problem? The
interrupts occur at the same rate when no keyboard interrupt is
running. The interrupt only schedules the handler thread. If it's
already scheduled, it's very fast.
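
The figures quoted above fit together arithmetically. The following is a minimal sketch, assuming a 115200 bps serial line with 10 bits per character and one interrupt per character; neither assumption is stated in the thread, and the function names are made up for illustration.

```c
#include <assert.h>

/* Interrupts per second for a serial line that interrupts once per
 * character.  baud and bits_per_char are assumed values, not figures
 * taken from the thread. */
double serial_irq_rate_hz(long baud, int bits_per_char)
{
    assert(baud > 0 && bits_per_char > 0);
    return (double)baud / bits_per_char;
}

/* Time between two such interrupts, in microseconds. */
double serial_irq_period_usec(long baud, int bits_per_char)
{
    return 1e6 / serial_irq_rate_hz(baud, bits_per_char);
}

/* How long a handler must have been busy for `interrupts` serial
 * interrupts to arrive at the given spacing, in microseconds. */
double busy_time_usec(int interrupts, double period_usec)
{
    return interrupts * period_usec;
}
```

Under those assumptions, serial_irq_rate_hz(115200, 10) gives 11520 interrupts per second, or one roughly every 87 usec, and busy_time_usec(50, 87.0) gives 4350 usec. So 50 interrupts correspond to the keyboard handler busy-waiting for a little over 4 ms.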
>> I think the masking was put in there as an optimization not
>> only for that, but also so the interrupt could be EOI'd early
>> so as to allow a new interrupt to become pending while the
>> interrupt routine was running (thus closing a potential window
>> of opportunity where an interrupt might otherwise be missed).
>
> This only works right for interrupts other than the one being
> handled (we guarantee not to miss other interrupts provided they are
> live for at least the few usec needed for interrupt processing
> before the EOI). Masking the current interrupt prevents ipending
> getting set for it, and there is a race to exit from the interrupt
> handler and clear the masks so that a new transient interrupt can be
> seen before it goes away. We certainly lose this race in some
> cases, e.g., when the exit is interrupted by another interrupt
> handler that takes too long. I suspect that most "stray" interrupts
> are caused by losing this race.
On Saturday, 22 July 2000 at 10:57:40 -0600, Chuck Paterson wrote:
>
> When I read your mail I didn't answer because what you said
> sounded right. But what was I thinking? The short answer is Matt
> is correct.
>
> When running in APIC mode BSD/OS uses auto EOI for the I/O APIC
> and counts on the level which gets set in the local APIC to mask
> interrupts while the handler is running. The local APIC gets an EOI
> when the handler is finished. If the thread blocks then the
> interrupt gets masked in hardware if it is level triggered, and in
> all cases it gets EOI'd. Given that you are just doing heavyweight
> interrupts, this is the equivalent of blocking on an interrupt, and
> you will always need to mask at least level triggered interrupts and
> EOI them.
>
> Matt's comment about bunches of extra level triggered
> interrupts being a problem is something that is going to have to be
> looked into with BSD/OS. The reason we don't mask the interrupts now
> is that doing the actual masking operation is soooo expensive. I
> suspect we will want to mark which edge triggered interrupts we
> wish to mask.
OK, I've been thinking about this. I've also looked at the tapes
taken at Yahoo! last month, which I'm currently copying. There you
(Chuck) stated that one of the main goals of SMPng was the elimination
of masking interrupts. Matt stated then that it wouldn't work like
that, but we didn't finish the discussion. We should probably do it
now.
My understanding is that the threaded interrupt stuff splits
interrupts (potentially) into two separate processing steps:
1. Receiving the interrupt, handling the PIC/APIC, and scheduling an
interrupt thread.
2. Processing the interrupt thread.
I understand that we would do (1) immediately and (2) when we have
time for it, but as quickly as possible. Sure, while we're handling a
slow operation like the keyboard LEDs, a lot of other interrupts can
come in. But the interrupt rate won't change: the serial interrupts
still come in at 11.5 kHz. All that happens there is that step (1)
will be executed each time, just like it would be at any other time.
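
The two-step split described above can be sketched as a single-threaded simulation. This is only an illustration: the struct and function names are hypothetical, and it assumes that one pass of the handler services everything the device has pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one simulated interrupt thread. */
struct sim_ithd {
    bool need;              /* step (1) saw an interrupt not yet handled */
    bool runnable;          /* thread is on the simulated run queue */
    unsigned arrivals;      /* how often step (1) ran */
    unsigned enqueues;      /* how often the thread was put on the queue */
    unsigned handler_runs;  /* how often the handler body ran */
};

/* Step (1): acknowledge the interrupt and schedule the thread.
 * If the thread is already queued, only the flag and counter change. */
void sim_interrupt(struct sim_ithd *t)
{
    t->arrivals++;
    t->need = true;
    if (!t->runnable) {
        t->runnable = true;
        t->enqueues++;
    }
}

/* Step (2): the interrupt thread runs the handler until no work is left. */
void sim_run_thread(struct sim_ithd *t)
{
    assert(t->runnable);
    while (t->need) {
        t->need = false;
        t->handler_runs++;  /* the device handler would be called here */
    }
    t->runnable = false;
}
```

In this model, 50 calls to sim_interrupt() before the thread gets to run cost 50 executions of step (1), but only one enqueue and one handler pass.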
The other issue is that we should potentially be able to handle one
interrupt in one processor and another in a different processor. I
know that there are problems in this attitude, and that we have
deferred it, but I can't see how this can work at all if we mask the
interrupts.
Looking at the code we have at the moment, the "slow" interrupts write
two values to the PIC, one to set the mask and one to EOI, before
processing the interrupt, and another after the interrupt to unmask
the interrupt:
Xintr3:
	pushl	$0				# push a dummy error code
	pushl	$0				# and a dummy trap type
	pushal					# save general registers
	pushl	%ds				# save segment registers
	pushl	%es
	pushl	%fs
	mov	$KDSEL, %ax
	mov	%ax,%ds				# and point to kernel data segment
	mov	%ax,%es
	mov	%ax,%fs
	movb	imen + (irq_num >> 3), %al	# get the correct byte of the mask register
	orb	$(1 << (irq_num % 8)), %al	# and set the corresponding bit in %al
	movb	%al,imen + (irq_num >> 3)	# store it back
	outb	%al,$ICU_IMR			# and set it in the ICU mask register
	movb	$0x20, %al			# EOI OCW
	outb	%al,$ICU			# enable the ICU
	movl	cpl, %eax			# get the current priority level
	testb	$(1 << (irq_num % 8)), %al	# are we masking it in software?
	jne	2f				# yes: just note it's pending
	incb	intr_nesting_level		# note one more nested interrupt
Xresume3:
	incl	cnt + 0xc			# note another global interrupt
	movl	intr_countp + irq_num * 4,%eax	# point to interrupt count
	incl	(%eax)				# and increment it
	movl	cpl, %eax			# get CPL again
	pushl	%eax				# save on the stack
	pushl	intr_unit + irq_num * 4		# unit number for this IRQ
	orl	intr_mask + irq_num * 4,%eax	# mask for this IRQ
	movl	%eax,cpl			# is the new CPL
	sti					# reenable interrupts
	call	*intr_handler + irq_num * 4	# call handler (unit)
	cli					# disable interrupts again
	movb	imen + (irq_num >> 3), %al	# get the interrupt mask
	andb	$~(1 << (irq_num % 8)), %al	# unmask this interrupt
	movb	%al,imen + (irq_num >> 3)	# save new mask
	outb	%al,$ICU_IMR			# and set the ICU mask
	sti					# enable interrupts again
	jmp	doreti				# and return from the interrupt
This happens on every interrupt, and as Bruce points out, the writes
take a long time. By comparison, the code I have now is much shorter
and contains only the minimum 1 outb instruction:
Xintr3:
	pushl	$0
	pushl	$0
	pushal
	pushl	%ds
	pushl	%es
	pushl	%fs
	mov	$0x10,%ax
	mov	%ax,%ds
	mov	%ax,%es
	mov	%ax,%fs
	movb	$0x20,%al
	outb	%al,$0x020
	incb	intr_nesting_level
Xresume3:
	pushl	$3				# IRQ number
	sti
	call	sched_ithd
	jmp	doreti
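
To make the difference in port writes concrete, here is a rough user-space model of the two entry paths. It assumes a two-byte in-memory mask copy like imen. The sim_ names are invented for this sketch, and sim_outb() only counts writes rather than touching hardware.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_EOI 0x20            /* EOI command byte, as in the listings */

static uint8_t  sim_imen[2];    /* in-memory copy of the two mask bytes */
static unsigned sim_port_writes;

/* Stand-in for outb: counts the write, does no I/O. */
static void sim_outb(uint8_t value)
{
    (void)value;
    sim_port_writes++;
}

/* Set the mask bit for irq: byte irq >> 3, bit irq % 8. */
void sim_mask_irq(unsigned irq)
{
    assert(irq < 16);
    sim_imen[irq >> 3] = (uint8_t)(sim_imen[irq >> 3] | (1u << (irq % 8)));
    sim_outb(sim_imen[irq >> 3]);
}

/* Clear the mask bit for irq again. */
void sim_unmask_irq(unsigned irq)
{
    assert(irq < 16);
    sim_imen[irq >> 3] = (uint8_t)(sim_imen[irq >> 3] & ~(1u << (irq % 8)));
    sim_outb(sim_imen[irq >> 3]);
}

/* Older path: mask, EOI, run the handler, unmask. */
void sim_old_path(unsigned irq)
{
    sim_mask_irq(irq);
    sim_outb(SIM_EOI);
    /* the handler would run here */
    sim_unmask_irq(irq);
}

/* Newer path: EOI only, then schedule the interrupt thread. */
void sim_new_path(unsigned irq)
{
    (void)irq;
    sim_outb(SIM_EOI);
    /* the call to sched_ithd would go here */
}

unsigned sim_writes(void)
{
    return sim_port_writes;
}
```

In this model, one pass through sim_old_path() costs three port writes and one through sim_new_path() costs one.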
sched_ithd is pretty close to the BSD/OS code. I've left the
debugging code and some comments out of this version:
void
sched_ithd(void *cookie)
{
	int irq = (int) cookie;			/* IRQ number */
	ithd *ir = ithds[irq];			/* find our process */

	cnt.v_intr++;				/* one more global interrupt */
	intr_countp[irq]++;			/* one more for this IRQ */
	ir->it_need = 1;
	mtx_enter(&sched_lock, MTX_SPIN);
	if (ir->it_run == 0) {
		ir->it_run = 2;
		setrunqueue(ir->it_proc);
		aston();
	}
	mtx_exit(&sched_lock, MTX_SPIN);
	aston();				/* ??? check priorities first? */
}
The issue here, of course, is taking the scheduler lock. It seems to
me that we can avoid taking it if we first set it_need and then check
it_run and find it set, but I haven't thought through all the possible
race conditions there. Suggestions are welcome.
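
One possible shape for that fast path is sketched below, using C11 atomics in user space. It is an illustration of the idea only, not kernel code. The sketch_ names, the lock counter standing in for sched_lock, and the thread-side re-check of it_need are all assumptions, and it does not settle the race conditions mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the interrupt thread descriptor. */
struct sketch_ithd {
    atomic_int it_need;          /* work is pending for the thread */
    atomic_int it_run;           /* thread is scheduled or running */
    unsigned lock_acquisitions;  /* times the "scheduler lock" was taken */
};

void sketch_init(struct sketch_ithd *ir)
{
    atomic_init(&ir->it_need, 0);
    atomic_init(&ir->it_run, 0);
    ir->lock_acquisitions = 0;
}

/* Interrupt side.  Returns true if the slow (locked) path was taken. */
bool sketch_sched_ithd(struct sketch_ithd *ir)
{
    atomic_store(&ir->it_need, 1);          /* set it_need first ... */
    if (atomic_load(&ir->it_run) != 0)      /* ... then check it_run */
        return false;                       /* already scheduled: no lock */

    ir->lock_acquisitions++;                /* stands in for mtx_enter() */
    if (atomic_load(&ir->it_run) == 0)
        atomic_store(&ir->it_run, 1);       /* setrunqueue() would go here */
    return true;                            /* mtx_exit() would go here */
}

/* Thread side.  Returns the number of handler passes.  After clearing
 * it_run it looks at it_need once more, so that a request made just
 * before the clear is not left behind. */
unsigned sketch_ithd_loop(struct sketch_ithd *ir)
{
    unsigned passes = 0;

    for (;;) {
        while (atomic_exchange(&ir->it_need, 0))
            passes++;                       /* the handler would run here */
        atomic_store(&ir->it_run, 0);
        if (atomic_load(&ir->it_need) == 0)
            break;
        atomic_store(&ir->it_run, 1);       /* late request: go around again */
    }
    return passes;
}
```

In this sketch, the first interrupt takes the locked path, and any further interrupt that arrives while it_run is set returns without taking the lock. An interrupt that sees it_run already cleared can still take the locked path and queue the thread while the thread is picking up the same request through its re-check, so the interaction with the run queue would need the same careful look.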
Anyway, to get back to the original issue: as Bruce observes, the I/O
operations are very slow compared with instruction execution. I'd
hope that this alternative would give us a performance boost.
Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message
