Date: Mon, 7 Jun 2010 09:55:54 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Cc: Alexander Motin <mav@freebsd.org>
Subject: Re: ioapic_assign_cpu() on active level-triggered interrupt
Message-ID: <201006070955.54445.jhb@freebsd.org>
In-Reply-To: <4C094635.40002@FreeBSD.org>
References: <4C094635.40002@FreeBSD.org>

On Friday 04 June 2010 2:30:13 pm Alexander Motin wrote:
> Hi.
>
> I am working on driver for HPET event timers. It works mostly fine,
> except after some cases when ioapic_assign_cpu() called while timer is
> active. Under interrupt rate of 10KHz it is enough a dozen cpuset runs
> to break it (with 1KHz - few dozens). When it happens, I can see that
> timer is still running, interrupt status register is changing, but no
> interrupts received.
>
> Timer uses level-triggered interrupts, so it is tolerant to interrupt
> losses. I have tried to not acknowledge some, and they have immediately
> got back to me again, as expected for level-triggering. Timer runs in
> periodic mode, so it doesn't need handling to continue counting.
>
> I have reproduced it on two different i386 SMP systems: Core2Duo+ICH10
> and Core i5+PCH. With more experiments I have found that I can't trigger
> this issue if following patch applied:
>
> --- io_apic.c.prev	2010-06-02 10:55:56.000000000 +0300
> +++ io_apic.c	2010-06-04 17:45:51.000000000 +0300
> @@ -363,7 +366,10 @@ ioapic_assign_cpu(struct intsrc *isrc, u
>  		printf(") to lapic %u vector %u\n", intpin->io_cpu,
>  		    intpin->io_vector);
>  	}
> +	ioapic_disable_source(isrc, PIC_NO_EOI);
> +	DELAY(10);
>  	ioapic_program_intpin(intpin);
> +	ioapic_enable_source(isrc);
>  	/*
>  	 * Free the old vector after the new one is established.  This is done
>  	 * to prevent races where we could miss an interrupt.
>
> It is is almost a hack and 10us is completely experimental. But it looks
> like changing interrupt's APIC and vector in some moments of interrupt
> processing may be not a good idea.
>
> Can somebody explain this behavior and propose some solution? Have
> somebody seen it for regular PCI devices?

It probably would be best to disable the source, however, you can't just
re-enable it as it might already be disabled when you move it.  It is also
probably a bug that io_masked can be read w/o holding the icu_lock in
ioapic_program_intpin().

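The re-enable concern can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct model_pin, MODEL_INTMSET, and the model_* functions are hypothetical names that loosely mimic io_masked, io_lowreg, and the mask bit of the redirection entry. It contrasts a disable/reprogram/enable sequence with one that masks only the hardware entry and lets the reprogram step restore the logical state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model-only constant: stands in for the mask bit of the low redirection word. */
#define MODEL_INTMSET 0x00010000u

/* Hypothetical user-space stand-in for one I/O APIC pin. */
struct model_pin {
	uint32_t hw_low;   /* what the modeled hardware entry currently holds */
	uint32_t lowreg;   /* software copy of the last programmed value */
	bool     masked;   /* logical mask state requested by the driver */
	unsigned cpu;
	unsigned vector;
};

/* Reprogram the entry from software state, as a full reprogram would. */
static void
model_program(struct model_pin *pin)
{
	uint32_t low;

	assert(pin != NULL);
	low = pin->vector & 0xffu;
	if (pin->masked)
		low |= MODEL_INTMSET;
	pin->lowreg = low;
	pin->hw_low = low;
}

/* Disable, reprogram, then unconditionally enable: loses a prior mask. */
static void
model_migrate_naive(struct model_pin *pin, unsigned cpu, unsigned vector)
{
	pin->masked = true;
	pin->hw_low |= MODEL_INTMSET;
	pin->cpu = cpu;
	pin->vector = vector;
	model_program(pin);
	pin->masked = false;            /* forgets whether it was masked before */
	pin->lowreg &= ~MODEL_INTMSET;
	pin->hw_low = pin->lowreg;
}

/* Mask only in hardware, leave the logical state alone, then reprogram. */
static void
model_migrate_preserving(struct model_pin *pin, unsigned cpu, unsigned vector)
{
	if (!pin->masked)
		pin->hw_low = pin->lowreg | MODEL_INTMSET;
	pin->cpu = cpu;
	pin->vector = vector;
	model_program(pin);             /* restores the mask bit from pin->masked */
}
```

In this model, a pin that was already masked before the move comes out unmasked from model_migrate_naive() but stays masked after model_migrate_preserving(). A pin that was enabled ends up enabled in both cases.
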
I think the icu_lock should be pushed up to callers of ioapic_program_intpin(),
and that you should explicitly do a simple write to mask the source a bit
earlier.  Something like this perhaps:

Index: io_apic.c
===================================================================
--- io_apic.c	(revision 208714)
+++ io_apic.c	(working copy)
@@ -261,16 +261,15 @@
 	 * If a pin is completely invalid or if it is valid but hasn't
 	 * been enabled yet, just ensure that the pin is masked.
 	 */
+	mtx_assert(&icu_lock, MA_OWNED);
 	if (intpin->io_irq == IRQ_DISABLED || (intpin->io_irq < NUM_IO_INTS &&
 	    intpin->io_vector == 0)) {
-		mtx_lock_spin(&icu_lock);
 		low = ioapic_read(io->io_addr,
 		    IOAPIC_REDTBL_LO(intpin->io_intpin));
 		if ((low & IOART_INTMASK) == IOART_INTMCLR)
 			ioapic_write(io->io_addr,
 			    IOAPIC_REDTBL_LO(intpin->io_intpin),
 			    low | IOART_INTMSET);
-		mtx_unlock_spin(&icu_lock);
 		return;
 	}
 
@@ -312,14 +311,12 @@
 	}
 
 	/* Write the values to the APIC. */
-	mtx_lock_spin(&icu_lock);
 	intpin->io_lowreg = low;
 	ioapic_write(io->io_addr, IOAPIC_REDTBL_LO(intpin->io_intpin), low);
 	value = ioapic_read(io->io_addr, IOAPIC_REDTBL_HI(intpin->io_intpin));
 	value &= ~IOART_DEST;
 	value |= high;
 	ioapic_write(io->io_addr, IOAPIC_REDTBL_HI(intpin->io_intpin), value);
-	mtx_unlock_spin(&icu_lock);
 }
 
 static int
@@ -352,6 +349,12 @@
 	if (new_vector == 0)
 		return (ENOSPC);
 
+	/* Mask the old intpin if it is enabled while it is migrated. */
+	mtx_lock_spin(&icu_lock);
+	if (!intpin->io_masked)
+		ioapic_write(io->io_addr, IOAPIC_REDTBL_LO(intpin->io_intpin),
+		    intpin->io_lowreg | IOART_INTMSET);
+
 	intpin->io_cpu = apic_id;
 	intpin->io_vector = new_vector;
 	if (isrc->is_handlers > 0)
@@ -364,6 +367,8 @@
 		    intpin->io_vector);
 	}
 	ioapic_program_intpin(intpin);
+	mtx_unlock_spin(&icu_lock);
+
 	/*
 	 * Free the old vector after the new one is established.  This is done
 	 * to prevent races where we could miss an interrupt.
@@ -399,9 +404,11 @@
 		/* Mask this interrupt pin and free its APIC vector. */
 		vector = intpin->io_vector;
 		apic_disable_vector(intpin->io_cpu, vector);
+		mtx_lock_spin(&icu_lock);
 		intpin->io_masked = 1;
 		intpin->io_vector = 0;
 		ioapic_program_intpin(intpin);
+		mtx_unlock_spin(&icu_lock);
 		apic_free_vector(intpin->io_cpu, vector, intpin->io_irq);
 	}
 }
@@ -443,6 +450,7 @@
 	 * XXX: Should we write to the ELCR if the trigger mode changes for
 	 * an EISA IRQ or an ISA IRQ with the ELCR present?
 	 */
+	mtx_lock_spin(&icu_lock);
 	if (intpin->io_bus == APIC_BUS_EISA)
 		pol = INTR_POLARITY_HIGH;
 	changed = 0;
@@ -464,6 +472,7 @@
 	}
 	if (changed)
 		ioapic_program_intpin(intpin);
+	mtx_unlock_spin(&icu_lock);
 	return (0);
 }
 
@@ -473,8 +482,10 @@
 	struct ioapic *io = (struct ioapic *)pic;
 	int i;
 
+	mtx_lock_spin(&icu_lock);
 	for (i = 0; i < io->io_numintr; i++)
 		ioapic_program_intpin(&io->io_pins[i]);
+	mtx_unlock_spin(&icu_lock);
 }
 
 /*

If you find that you still need the DELAY(10), you could place it in the
conditional block that masks the interrupt perhaps.

-- 
John Baldwin