FreeBSD Mail Archives

Date:      Mon, 24 Jul 2000 12:45:20 +0930
From:      Greg Lehey <grog@lemis.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>, Chuck Paterson <cp@bsdi.com>, Bruce Evans <bde@zeta.org.au>
Cc:        David Greenman <dg@root.com>, freebsd-smp@FreeBSD.ORG
Subject:   : ipending (was: SMP progress (was: Stepping on Toes))
Message-ID:  <20000724124520.F82241@wantadilla.lemis.com>
In-Reply-To: <200007221620.JAA29862@apollo.backplane.com>; from dillon@apollo.backplane.com on Sat, Jul 22, 2000 at 09:20:23AM -0700
References:  <200007221620.JAA29862@apollo.backplane.com> <Pine.BSF.4.21.0007230257270.810-100000@besplex.bde.org> <200007221657.KAA20309@berserker.bsdi.com> <200007051652.KAA14768@berserker.bsdi.com> <20000722185705.A10221@wantadilla.lemis.com> <200007221620.JAA29862@apollo.backplane.com>

On Sunday, 23 July 2000 at  3:36:39 +1000, Bruce Evans wrote:
> On Sat, 22 Jul 2000, Matthew Dillon wrote:
>
>>> In a similar way, I'm removing interrupt mask copies in memory.  We
>>> still mask interrupts which aren't in use, but no others.  If anybody
>>> has any reason not to want to do this, we should talk about it.
>
> That would be wrong.  The interrupt masks are kept in memory because it
> is much faster to access them there.  Several hundred times faster than
> PIC accesses on current machines.  When I implemented this on 1992's
> machines, the memory copies were less than ten times faster.  The APIC
> case is not as bad.  I don't know the details of it.

It wasn't my intention to retrieve them from the PIC/APIC.  Certainly
if we have to set interrupt masks, we'd keep copies of them in memory.
More on this below.
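
A minimal sketch of the memory-copy idea, for illustration only: the
mask lives in memory, queries never touch the hardware, and the slow
port write happens only when the mask really changes.  The names
fake_pic_imr, pic_write_imr, irq_unmask and irq_is_masked are
hypothetical, the port is simulated by a variable, and only a single
8-input PIC is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PIC mask register; a real kernel
 * would do an outb() to the ICU mask port instead. */
static uint8_t fake_pic_imr = 0xff;
static unsigned port_writes;		/* counts the slow I/O accesses */

static uint8_t imen_shadow = 0xff;	/* in-memory copy, all masked */

static void
pic_write_imr(uint8_t mask)
{
	fake_pic_imr = mask;
	port_writes++;
}

/* Unmask an IRQ; touch the hardware only if the mask changes. */
static void
irq_unmask(int irq)
{
	uint8_t new_mask = imen_shadow & (uint8_t)~(1u << (irq % 8));

	if (new_mask != imen_shadow) {
		imen_shadow = new_mask;
		pic_write_imr(new_mask);
	}
}

/* Query the mask from memory: no port access at all. */
static int
irq_is_masked(int irq)
{
	return (imen_shadow >> (irq % 8)) & 1;
}
```

Unmasking the same IRQ twice costs one port write in this model, and
reading the mask costs none.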

>>     I think you still have to mask level interrupts, otherwise you won't
>>     be able to sti.   Some subsystems may generate a phenomenal number
>>     of interrupts while the interrupt routine is running -- for example,
>                           ^^^ another
>>     the serial ports. 
>
> E.g., one serial port interrupting every 87 usec gives about 50
> interrupts while the keyboard interrupt handler is busy-waiting to
> program the keyboard LEDs.

Ugh.  Are keyboards that slow?  But is that such a big problem?  The
interrupts occur at the same rate when no keyboard interrupt is
running.  The interrupt only schedules the handler thread.  If it's
already scheduled, it's very fast.
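
The numbers quoted above can be checked with a little arithmetic.
This sketch assumes a 115200 bps line with 10 bits per character and
one interrupt per character (no FIFO); those parameters are not stated
in the thread, they are simply the ones that reproduce the 87 usec
figure.  The 4350 usec busy-wait used in the check is likewise derived
from "about 50 interrupts", not measured.

```c
#include <assert.h>

/* Assumed line parameters (not stated in the thread). */
#define BAUD		115200
#define BITS_PER_CHAR	10	/* start + 8 data + stop */

/* Interrupts per second for one port, one interrupt per character. */
static double
intr_rate_hz(void)
{
	return (double)BAUD / BITS_PER_CHAR;
}

/* Interval between interrupts in microseconds. */
static double
intr_interval_usec(void)
{
	return 1e6 / intr_rate_hz();
}

/* Whole interrupts arriving during a busy-wait of the given length. */
static int
intrs_during(double busy_usec)
{
	return (int)(busy_usec / intr_interval_usec());
}
```

Under these assumptions the rate is 11520 interrupts per second, the
interval is a little under 87 usec, and 50 interrupts correspond to a
busy-wait of roughly 4.3 ms.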

>>     I think the masking was put in there as an optimization not
>>     only for that, but also so the interrupt could be EOI'd early
>>     so as to allow a new interrupt to become pending while the
>>     interrupt routine was running (thus closing a potential window
>>     of opportunity where an interrupt might otherwise be missed).
>
> This only works right for interrupts other than the one being
> handled (we guarantee not to miss other interrupts provided they are
> live for at least the few usec needed for interrupt processing
> before the EOI).  Masking the current interrupt prevents ipending
> getting set for it, and there is a race to exit from the interrupt
> handler and clear the masks so that a new transient interrupt can be
> seen before it goes away.  We certainly lose this race in some
> cases, e.g., when the exit is interrupted by another interrupt
> handler that takes too long.  I suspect that most "stray" interrupts
> are caused by losing this race.

On Saturday, 22 July 2000 at 10:57:40 -0600, Chuck Paterson wrote:
>
> 	When I read your mail I didn't answer because what you said
> sounded right. But what was I thinking? The short answer is Matt
> is correct.
>
> 	When running in APIC mode BSD/OS uses auto EOI for the I/O APIC
> and counts on the level which gets set in the local APIC to mask
> interrupts while the handler is running. The local APIC gets an EOI
> when the handler is finished. If the thread blocks then the
> interrupt gets masked in hardware if it is level triggered, and in
> all cases it gets EOI'd.  Given that you are just doing heavy weight
> interrupts this is the equivalent of blocking on an interrupt and
> you will always need to mask at least level triggered interrupts and
> EOI them.
>
> 	Matt's comment about bunches of extra level triggered
> interrupts being a problem is something that is going to have to be
> looked into with BSD/OS. The reason we don't mask the interrupts now
> is that doing the actual masking operation is soooo expensive.  I
> suspect we will want to mark which edge triggered interrupts we
> wish to mask.

OK, I've been thinking about this.  I've also looked at the tapes
taken at Yahoo! last month, which I'm currently copying.  There you
(Chuck) stated that one of the main goals of SMPng was the elimination
of masking interrupts.  Matt stated then that it wouldn't work like
that, but we didn't finish the discussion.  We should probably do it
now.

My understanding is that the threaded interrupt stuff splits
interrupts (potentially) into two separate processing steps:

1.  Receiving the interrupt, handling the PIC/APIC, and scheduling an
    interrupt thread.

2.  Processing the interrupt thread.

I understand that we would do (1) immediately and (2) when we have
time for it, but as quickly as possible.  Sure, while we're handling a
slow operation like the keyboard LEDs, a lot of other interrupts can
come in.  But the interrupt rate won't change, the serial interrupts
still come in at 11.5 kHz.  All that happens there is that step (1)
will be executed each time, just like it would be at any other time.
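
As a rough model of the two steps, the following sketch shows why a
burst of interrupts costs only repeated executions of step (1): once
the thread is queued, further interrupts just note that work is
pending.  The structure and function names (ithd_model, intr_receive,
ithread_run) are hypothetical and only loosely modelled on
it_need/it_run; there is no real scheduler or hardware here, and it
assumes the handler can pick up everything that arrived since it last
ran.

```c
#include <assert.h>

#define NIRQ 16

/* Hypothetical per-IRQ state for the model. */
struct ithd_model {
	int need;		/* work pending for the handler */
	int queued;		/* thread is already on the run queue */
	unsigned received;	/* executions of step 1 */
	unsigned handled;	/* executions of step 2 */
};

static struct ithd_model ithd_tab[NIRQ];

/* Step 1: runs on every hardware interrupt; cheap if already queued. */
static void
intr_receive(int irq)
{
	struct ithd_model *it = &ithd_tab[irq];

	it->received++;
	it->need = 1;
	if (!it->queued)
		it->queued = 1;		/* stands in for setrunqueue() */
}

/* Step 2: runs later in thread context and drains pending work. */
static void
ithread_run(int irq)
{
	struct ithd_model *it = &ithd_tab[irq];

	while (it->need) {
		it->need = 0;
		it->handled++;		/* stands in for the driver handler */
	}
	it->queued = 0;
}
```

In this model, 50 calls to intr_receive() followed by a single
ithread_run() give 50 executions of step (1) and one pass of step (2).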

The other issue is that we should potentially be able to handle one
interrupt on one processor and another on a different processor.  I
know that there are problems with this approach, and that we have
deferred it, but I can't see how this can work at all if we mask the
interrupts.

Looking at the code we have at the moment, the "slow" interrupts write
two values to the PIC, one to set the mask and one to EOI, before
processing the interrupt, and another after the interrupt to unmask
the interrupt:

  Xintr3:
          pushl   $0                                      push a dummy error code
          pushl   $0                                      and a dummy trap type
          pushal                                          save general registers
          pushl   %ds                                     save segment registers
          pushl   %es
          pushl   %fs
          mov     $KDSEL, %ax
          mov     %ax,%ds                                 and point to kernel data segment
          mov     %ax,%es
          mov     %ax,%fs
          movb    imen + (irq_num >> 3), %al              get the correct byte of the mask register
          orb     $(1 << (irq_num % 8)), %al              and set the corresponding bit in %al
          movb    %al,imen + (irq_num >> 3)               store it back
          outb    %al,$ICU_IMR                            and set it in the ICU mask register
          movb    $0x20, %al                              non-specific EOI command
          outb    %al,$ICU                                send the EOI to the ICU
          movl    cpl, %eax                               get the current priority level
          testb   $(1 << (irq_num % 8)), %al              are we masking it in software?
          jne     2f                                      yes: just note it's pending
          incb    intr_nesting_level                      note one more nested interrupt
  Xresume3: incl  cnt + 0xc                               note another global interrupt
          movl    intr_countp + irq_num * 4,%eax          point to interrupt count
          incl    (%eax)                                  and increment it
          movl    cpl, %eax                               get CPL again
          pushl   %eax                                    save on the stack
          pushl   intr_unit + irq_num * 4                 unit number for this IRQ
          orl     intr_mask + irq_num * 4,%eax            mask for this IRQ
          movl    %eax,cpl                                is the new CPL
          sti                                             reenable interrupts
          call    *intr_handler + irq_num * 4             call handler (unit)
          cli                                             disable interrupts again
          movb    imen + (irq_num >> 3), %al              get the interrupt mask
          andb    $~(1 << (irq_num % 8)), %al             unmask this interrupt
          movb    %al,imen + (irq_num >> 3)               save new mask
          outb    %al,$ICU_IMR                            and set the ICU mask
          sti                                             enable interrupts again
          jmp     doreti                                  and return from the interrupt

This happens on every interrupt, and as Bruce points out, the writes
take a long time.  By comparison, the code I have now is much shorter
and contains only a single outb instruction, the minimum possible:

Xintr3:
	pushl	$0
	pushl	$0
	pushal
	pushl	%ds
	pushl	%es
	pushl	%fs
	mov	$0x10 ,%ax
	mov	%ax,%ds
	mov	%ax,%es
	mov	%ax,%fs
	movb	$0x20 ,%al
	outb	%al,$0x020
	incb	intr_nesting_level
Xresume3:
	pushl	$3				/* IRQ number */
	sti
	call	sched_ithd
	jmp	doreti

sched_ithd is pretty close to the BSD/OS code.  I've left the
debugging code and some comments out of this version:

void
sched_ithd(void *cookie)
{
	int irq;
	ithd *ir;

	irq = (int) cookie;
	ir = ithds[irq];		/* find our process */
	cnt.v_intr++;			/* one more global interrupt */
	intr_countp[irq]++;		/* one more for this IRQ */

	ir->it_need = 1;
	mtx_enter(&sched_lock, MTX_SPIN);
	if (ir->it_run == 0) {
		ir->it_run = 2;
		setrunqueue(ir->it_proc);
		aston();
	}
	mtx_exit(&sched_lock, MTX_SPIN);
	aston();			/* ??? check priorities first? */
}

The issue here, of course, is taking the scheduler lock.  It seems to
me that we can avoid taking it if we first set it_need and then check
it_run and find it set, but I haven't thought through all the possible
race conditions there.  Suggestions are welcome.
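
One possible shape for that fast path, as a sketch only and not a
claim about what the kernel should do: the interrupt side publishes
it_need before looking at it_run, and the thread side clears it_run
before rechecking it_need, so at least one side notices the other.
It uses C11 atomics with their default sequentially consistent
ordering and a toy spin lock in place of sched_lock.  All names ending
in _sketch, the lock helpers and the counters are hypothetical, and
it_run is a plain 0/1 flag here rather than the value 2 used above.

```c
#include <assert.h>
#include <stdatomic.h>

struct ithd_sketch {
	atomic_int it_need;	/* interrupt wants service */
	atomic_int it_run;	/* thread is queued or running */
};

static atomic_flag sched_lock_sketch = ATOMIC_FLAG_INIT;
static unsigned lock_acquisitions;	/* how often we paid for the lock */
static unsigned enqueues;		/* stands in for setrunqueue() calls */

static void
lock_sched(void)
{
	while (atomic_flag_test_and_set(&sched_lock_sketch))
		;			/* spin */
	lock_acquisitions++;
}

static void
unlock_sched(void)
{
	atomic_flag_clear(&sched_lock_sketch);
}

/* Interrupt side: skip the lock if the thread is already active. */
static void
sched_ithd_sketch(struct ithd_sketch *ir)
{
	atomic_store(&ir->it_need, 1);		/* publish the request first */
	if (atomic_load(&ir->it_run) != 0)
		return;				/* fast path */
	lock_sched();
	if (atomic_load(&ir->it_run) == 0) {	/* recheck under the lock */
		atomic_store(&ir->it_run, 1);
		enqueues++;
	}
	unlock_sched();
}

/* Thread side: returns the number of handler passes made. */
static unsigned
ithd_loop_sketch(struct ithd_sketch *ir)
{
	unsigned passes = 0;

	for (;;) {
		while (atomic_exchange(&ir->it_need, 0) != 0)
			passes++;		/* stands in for the handler */
		atomic_store(&ir->it_run, 0);	/* announce intent to sleep */
		if (atomic_load(&ir->it_need) == 0)
			return passes;		/* nothing slipped in */
		lock_sched();			/* an interrupt raced with us */
		if (atomic_load(&ir->it_run) != 0) {
			unlock_sched();		/* slow path requeued us */
			return passes;
		}
		atomic_store(&ir->it_run, 1);	/* reclaim and go round again */
		unlock_sched();
	}
}
```

The order of operations is the important part: if the thread cleared
it_run only after its final check of it_need, an interrupt arriving in
between would take the fast path and the wakeup would be lost.  The
sketch says nothing about priorities or about when aston() is needed.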

Anyway, to get back to the original issue: as Bruce observes, the I/O
operations are very slow compared with instruction execution.  I'd
hope that this alternative would give us a performance boost.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message
