Date: Mon, 24 Jul 2000 12:45:20 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Dillon <dillon@apollo.backplane.com>, Chuck Paterson <cp@bsdi.com>, Bruce Evans <bde@zeta.org.au>
Cc: David Greenman <dg@root.com>, freebsd-smp@FreeBSD.ORG
Subject: ipending (was: SMP progress (was: Stepping on Toes))
Message-ID: <20000724124520.F82241@wantadilla.lemis.com>
In-Reply-To: <200007221620.JAA29862@apollo.backplane.com>; from dillon@apollo.backplane.com on Sat, Jul 22, 2000 at 09:20:23AM -0700
References: <200007221620.JAA29862@apollo.backplane.com> <Pine.BSF.4.21.0007230257270.810-100000@besplex.bde.org> <200007221657.KAA20309@berserker.bsdi.com> <200007051652.KAA14768@berserker.bsdi.com> <20000722185705.A10221@wantadilla.lemis.com> <200007221620.JAA29862@apollo.backplane.com>

On Sunday, 23 July 2000 at  3:36:39 +1000, Bruce Evans wrote:
> On Sat, 22 Jul 2000, Matthew Dillon wrote:
>
>>> In a similar way, I'm removing interrupt mask copies in memory.  We
>>> still mask interrupts which aren't in use, but no others.  If anybody
>>> has any reason not to want to do this, we should talk about it.
>
> That would be wrong.  The interrupt masks are kept in memory because it
> is much faster to access them there.  Several hundred times faster than
> PIC accesses on current machines.  When I implemented this on 1992's
> machines, the memory copies were less than ten times faster.  The APIC
> case is not as bad.  I don't know the details of it.

It wasn't my intention to retrieve them from the PIC/APIC.  Certainly
if we have to set interrupt masks, we'd keep copies of them in memory.
More on this below.

>> I think you still have to mask level interrupts, otherwise you won't
>> be able to sti.  Some subsystems may generate a phenomenal number
>> of interrupts while the interrupt routine is running -- for example,
>                                                          ^^^ another
>> the serial ports.
>
> E.g., one serial port interrupting every 87 usec gives about 50
> interrupts while the keyboard interrupt handler is busy-waiting to
> program the keyboard LEDs.

Ugh.  Are keyboards that slow?

But is that such a big problem?  The interrupts occur at the same rate
when no keyboard interrupt is running.  The interrupt only schedules
the handler thread.  If it's already scheduled, it's very fast.

>> I think the masking was put in there as an optimization not
>> only for that, but also so the interrupt could be EOI'd early
>> so as to allow a new interrupt to become pending while the
>> interrupt routine was running (thus closing a potential window
>> of opportunity where an interrupt might otherwise be missed).
>
> This only works right for interrupts other than the one being
> handled (we guarantee not to miss other interrupts provided they are
> live for at least the few usec needed for interrupt processing
> before the EOI).  Masking the current interrupt prevents ipending
> getting set for it, and there is a race to exit from the interrupt
> handler and clear the masks so that a new transient interrupt can be
> seen before it goes away.  We certainly lose this race in some
> cases, e.g., when the exit is interrupted by another interrupt
> handler that takes too long.  I suspect that most "stray" interrupts
> are caused by losing this race.

On Saturday, 22 July 2000 at 10:57:40 -0600, Chuck Paterson wrote:
>
> When I read your mail I didn't answer because what you said
> sounded right.  But, what was I thinking.  The short answer is Matt
> is correct.
>
> When running in APIC mode BSD/OS uses auto EOI for the I/O APIC
> and counts on the level which gets set in the local APIC to mask
> interrupts while the handler is running.  The local APIC gets an EOI
> when the handler is finished.  If the thread blocks then the
> interrupt gets masked in hardware if it is level triggered, and in
> all cases it gets EOI'd.  Given that you are just doing heavy weight
> interrupts this is the equivalent of blocking on an interrupt and
> you will always need to mask at least level triggered interrupts and
> EOI them.
>
> Matt's comment about bunches of extra level triggered
> interrupts being a problem is something that is going to have to be
> looked into with BSD/OS.  The reason we don't mask the interrupts now
> is that doing the actual masking operation is soooo expensive.  I
> suspect we will want to mark which edge triggered interrupts we
> wish to mask.

OK, I've been thinking about this.  I've also looked at the tapes
taken at Yahoo! last month, which I'm currently copying.  There you
(Chuck) stated that one of the main goals of SMPng was the elimination
of masking interrupts.
Matt stated then that it wouldn't work like that, but we didn't finish
the discussion.  We should probably do it now.

My understanding is that the threaded interrupt stuff splits
interrupts (potentially) into two separate processing steps:

1.  Receiving the interrupt, handling the PIC/APIC, and scheduling an
    interrupt thread.
2.  Processing the interrupt thread.

I understand that we would do (1) immediately and (2) when we have
time for it, but as quickly as possible.  Sure, while we're handling a
slow operation like the keyboard LEDs, a lot of other interrupts can
come in.  But the interrupt rate won't change: the serial interrupts
still come in at 11.5 kHz.  All that happens there is that step (1)
will be executed each time, just like it would be at any other time.

The other issue is that we should potentially be able to handle one
interrupt in one processor and another in a different processor.  I
know that there are problems with this approach, and that we have
deferred it, but I can't see how this can work at all if we mask the
interrupts.

Looking at the code we have at the moment, the "slow" interrupts write
two values to the PIC, one to set the mask and one to EOI, before
processing the interrupt, and another after the interrupt to unmask
the interrupt:

Xintr3:
        pushl   $0                              /* push a dummy error code */
        pushl   $0                              /* and a dummy trap type */
        pushal                                  /* save general registers */
        pushl   %ds                             /* save segment registers */
        pushl   %es
        pushl   %fs
        mov     $KDSEL, %ax
        mov     %ax,%ds                         /* and point to kernel data segment */
        mov     %ax,%es
        mov     %ax,%fs
        movb    imen + (irq_num >> 3), %al      /* get the correct byte of the mask register */
        orb     $(1 << (irq_num % 8)), %al      /* and set the corresponding bit in %al */
        movb    %al,imen + (irq_num >> 3)       /* store it back */
        outb    %al,$ICU_IMR                    /* and set it in the ICU mask register */
        movb    $0x20, %al                      /* EOI OCW */
        outb    %al,$ICU                        /* enable the ICU */
        movl    cpl, %eax                       /* get the current priority level */
        testb   $(1 << (irq_num % 8)), %al      /* are we masking it in software? */
        jne     2f                              /* yes: just note it's pending */
        incb    intr_nesting_level              /* note one more nested interrupt */
Xresume3:
        incl    cnt + 0xc                       /* note another global interrupt */
        movl    intr_countp + irq_num * 4,%eax  /* point to interrupt count */
        incl    (%eax)                          /* and increment it */
        movl    cpl, %eax                       /* get CPL again */
        pushl   %eax                            /* save on the stack */
        pushl   intr_unit + irq_num * 4         /* unit number for this IRQ */
        orl     intr_mask + irq_num * 4,%eax    /* mask for this IRQ */
        movl    %eax,cpl                        /* is the new CPL */
        sti                                     /* reenable interrupts */
        call    *intr_handler + irq_num * 4     /* call handler (unit) */
        cli                                     /* disable interrupts again */
        movb    imen + (irq_num >> 3), %al      /* get the interrupt mask */
        andb    $~(1 << (irq_num % 8)), %al     /* unmask this interrupt */
        movb    %al,imen + (irq_num >> 3)       /* save new mask */
        outb    %al,$ICU_IMR                    /* and set the ICU mask */
        sti                                     /* enable interrupts again */
        jmp     doreti                          /* and return from the interrupt */

This happens on every interrupt, and as Bruce points out, the writes
take a long time.  By comparison, the code I have now is much shorter
and contains only the minimum of one outb instruction:

Xintr3:
        pushl   $0
        pushl   $0
        pushal
        pushl   %ds
        pushl   %es
        pushl   %fs
        mov     $0x10 ,%ax
        mov     %ax,%ds
        mov     %ax,%es
        mov     %ax,%fs
        movb    $0x20 ,%al
        outb    %al,$0x020
        incb    intr_nesting_level
Xresume3:
        pushl   $3                              /* IRQ number */
        sti
        call    sched_ithd
        jmp     doreti

sched_ithd is pretty close to the BSD/OS code.  I've left the
debugging code and some comments out of this version:

void
sched_ithd(void *cookie)
{
        int irq;
        ithd *ir;

        irq = (int) cookie;
        ir = ithds[irq];                /* find our process */

        cnt.v_intr++;                   /* one more global interrupt */
        (*intr_countp[irq])++;          /* one more for this IRQ */
        ir->it_need = 1;
        mtx_enter(&sched_lock, MTX_SPIN);
        if (ir->it_run == 0) {
                ir->it_run = 2;
                setrunqueue(ir->it_proc);
                aston();
        }
        mtx_exit(&sched_lock, MTX_SPIN);
        aston();                        /* ??? check priorities first? */
}

The issue here, of course, is taking the scheduler lock.  It seems to
me that we can avoid taking it if we first set it_need and then check
it_run and find it set, but I haven't thought through all the possible
race conditions there.  Suggestions are welcome.
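
To make the lock-avoidance idea concrete, here is a minimal user-space
sketch.  It is not code from the tree and not a claim that the kernel
version would be race-free.  Everything in it is an assumption made
for illustration: the names ithd_sketch, sched_ithd_sketch,
ithd_loop_sketch, demo_sketch and slow_paths are hypothetical, C11
atomics stand in for whatever the kernel would really use, and a
counter stands in for taking sched_lock.  The point is the ordering:
the interrupt side sets it_need before looking at it_run, and the
thread side clears it_run before looking at it_need once more, so at
least one of the two should notice a late request.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-IRQ interrupt thread descriptor. */
struct ithd_sketch {
        atomic_int it_need;     /* set by the stub: there is work to do */
        atomic_int it_run;      /* nonzero while the thread is queued or running */
};

static int slow_paths;          /* how often we would have taken sched_lock */

/* Interrupt side.  Returns 1 if the thread was queued by this call. */
static int
sched_ithd_sketch(struct ithd_sketch *ir)
{
        int expected = 0;

        atomic_store(&ir->it_need, 1);          /* publish the request first */
        if (atomic_load(&ir->it_run) != 0)
                return (0);                     /* already scheduled: no lock */
        slow_paths++;                           /* mtx_enter(&sched_lock, ...) would go here */
        /* Assumption: a real kernel would call setrunqueue() on success. */
        return (atomic_compare_exchange_strong(&ir->it_run, &expected, 1));
}

/* Thread side.  Returns how many times the handler would have run. */
static int
ithd_loop_sketch(struct ithd_sketch *ir)
{
        int expected, runs = 0;

        for (;;) {
                while (atomic_exchange(&ir->it_need, 0) != 0)
                        runs++;                 /* the handler call would go here */
                atomic_store(&ir->it_run, 0);   /* about to sleep */
                if (atomic_load(&ir->it_need) == 0)
                        return (runs);          /* nothing arrived: really sleep */
                expected = 0;                   /* late interrupt: try to stay awake */
                if (!atomic_compare_exchange_strong(&ir->it_run, &expected, 1))
                        return (runs);          /* the stub requeued us instead */
        }
}

/* Single-threaded walk-through: three interrupts, then one thread pass. */
int
demo_sketch(void)
{
        struct ithd_sketch ir;

        atomic_init(&ir.it_need, 0);
        atomic_init(&ir.it_run, 0);
        slow_paths = 0;
        assert(sched_ithd_sketch(&ir) == 1);    /* idle thread: gets queued */
        assert(sched_ithd_sketch(&ir) == 0);    /* already queued: fast path */
        assert(sched_ithd_sketch(&ir) == 0);
        assert(ithd_loop_sketch(&ir) == 1);     /* pending requests coalesce */
        return (slow_paths);
}
```

In this walk-through only the first of the three interrupts goes
through the slow path.  Whether the same ordering argument holds with
the real scheduler and run queue is exactly the open question.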

Anyway, to get back to the original issue: as Bruce observes, the I/O
operations are very slow compared with instruction execution.  I'd
hope that this alternative would give us a performance boost.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message