Date: Sat, 6 Sep 2014 00:10:55 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, ticso@cicely.de, Ian Lepore <ian@freebsd.org>
Subject: Re: Cubieboard: Spurious interrupt detected
Message-ID: <20140906071055.GW82175@funkthat.com>
In-Reply-To: <CAJ-VmonPttv58SGziDda--GooyLJdCcsGXCzP-UyGkO5oO2i=Q@mail.gmail.com>
References: <2279481.3MX4OEDuCl@quad>
 <20140905215702.GL3196@cicely7.cicely.de>
 <1409958716.1150.321.camel@revolution.hippie.lan>
 <CAJ-Vmo=EJVFqNnMo_dzevGvFWLSR6LVfYbYmOot1bLZbCvVMTQ@mail.gmail.com>
 <20140906011526.GT82175@funkthat.com>
 <1409967197.1150.339.camel@revolution.hippie.lan>
 <20140906045403.GU82175@funkthat.com>
 <CAJ-VmonPttv58SGziDda--GooyLJdCcsGXCzP-UyGkO5oO2i=Q@mail.gmail.com>

Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 23:45 -0700:
> On 5 September 2014 21:54, John-Mark Gurney <jmg@funkthat.com> wrote:
> > Ian Lepore wrote this message on Fri, Sep 05, 2014 at 19:33 -0600:
> >> On Fri, 2014-09-05 at 18:15 -0700, John-Mark Gurney wrote:
> >> > Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 17:44 -0700:
> >> > > On 5 September 2014 16:11, Ian Lepore <ian@freebsd.org> wrote:
> >> > > > On Fri, 2014-09-05 at 23:57 +0200, Bernd Walter wrote:
> >> > > >> On Sat, Sep 06, 2014 at 01:43:23AM +0400, Maxim V FIlimonov wrote:
> >> > > >> > And another problem: every now and then the kernel says something like that:
> >> > > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> >> > > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> >> > > >> > Sep 5 19:23:46 last message repeated 10 times
> >> > > >> >
> >> > > >> > I've heard that FreeBSD happens to do that on ARM devices. What could be the
> >> > > >> > problem here?
> >> > > >>
> >> > > >> Means something generates interrupts, which are not handled by a driver.
> >> > > >> Could be the cause for your load problem too.
> >> > > >>
> >> > > >
> >> > > > No, that would be stray interrupts. Spurious interrupts happen when an
> >> > > > interrupt is asserted, but by the time the processor asks the interrupt
> >> > > > controller for the current active interrupt, it is no longer active.
> >> > > >
> >> > > > One way it can happen is when an interrupt handler writes to a device to
> >> > > > clear a pending interrupt and that write takes a long time to complete
> >> > > > because the device is on a slow bus, and the interrupt controller is on
> >> > > > a faster bus. The EOI to the controller outraces the device write that
> >> > > > would clear the pending interrupt condition, so the processor is
> >> > > > re-interrupted, but by the time it asks for the next active interrupt the
> >> > > > device write has finally completed and the interrupt is no longer
> >> > > > pending.
> >> > > >
> >> > > > That sequence used to happen a lot, and it was "fixed" by adding an
> >> > > > l2cache sync (basically a "drain write buffer") just before an EOI. You
> >> > > > sometimes still see an occasional spurious interrupt, but it shouldn't
> >> > > > be happening multiple times per second as seen in the logging above.
> >> > >
> >> > > Hm, interesting. I remember your discussion about it on IRC. The
> >> > > atheros code ends up working around this in the driver by doing a read
> >> > > from the ISR after writing out bits to clear things, so the clear is
> >> > > flushed out.
> >> > >
> >> > > I wonder if we should be asking all device drivers to be doing their
> >> > > own ISR flushing before returning from their interrupt handlers.
> >> >
> >> > This is required on PCI (that you do a read to clear the posted/pending
> >> > write)... So, IMO, yes, all device drivers should do the proper
> >> > clearing of their writes to the ISR...
> >> >
> >>
> >> But a driver can't assume that a read is sufficient on all architectures
> >> it may run on. bus_space_barrier() is the right way. Also, it's not
> >
> > Except that I don't think even on PCI a bus_space_barrier is sufficient...
>
> It isn't.
>
> The device itself may have FIFOs and internal busses that also need to
> be flushed.

Partly this depends upon which bus the device is on... Though we'd like
our drivers to be bus agnostic, items like these force them not to be...

> > I was just looking at i386's implementation of bus_space_barrier and
> > it just does a stack access... This won't be sufficient to clear any
> > PCI bridges that may have the write still pending...
>
> Right. The memory barrier semantics right now don't at all guarantee
> that bus and device FIFOs have actually been flushed.
>
> So I don't think doing it using the existing bus space barrier
> semantics is 'right'. For interrupts, it's highly likely that we do
> actually need device drivers to read from their interrupt register to
> ensure the update has been posted before returning. That's better than
> causing entire L2 cache flushes.
>
> Question is - can we expose this somehow as a generic device method,
> so the higher bus layers can actually do something with it, or should
> we just leave it to device drivers to correctly do?

Sadly, I really think it's up to the device driver... Are we going to
require the driver to expose a register that gets read before we EOI
the interrupt? What if another bus does it differently?

> (Also - do any of the freebsd device driver books or the handbook mention this?)

Well, when I did a quick search, I found an HP Writing PCI Driver
guide... Googled: "pci read after write isr" and the first link was:
http://h21007.www2.hp.com/portal/download/files/unprot/ddk/ddg/chap8.pdf

PCI Transaction Ordering - Write Side Effects

And I just checked the old D&I (haven't gotten my new one yet) and its
device driver section is quite small and doesn't mention this issue...
But it also doesn't talk about how to set up interrupts or resources...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
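
For anyone reading this thread later, the read-after-write idea discussed
above can be sketched roughly as follows. This is a user-space model only,
not FreeBSD driver code: struct fake_dev and every function name here are
hypothetical, and the "bridge" is just a field that holds a posted write
until the next read from the device. In a real driver the two accessors
would presumably be the usual register read/write calls on the device's
resource (for example bus_read_4()/bus_write_4()), and, as noted in the
thread, whether a plain read is enough depends on the bus and the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, not FreeBSD driver code: a device whose
 * register writes are "posted" (held in a bridge) until a later read from
 * the same device forces them to complete.
 */
struct fake_dev {
	uint32_t isr;     /* interrupt status bits as the device sees them */
	uint32_t posted;  /* write-1-to-clear bits still held in the bridge */
};

/* Post a write-1-to-clear to the ISR; it does not reach the device yet. */
static void
dev_write_isr(struct fake_dev *d, uint32_t bits)
{
	assert(d != NULL);
	d->posted |= bits;
}

/* In this model a read drains any earlier posted writes before returning. */
static uint32_t
dev_read_isr(struct fake_dev *d)
{
	assert(d != NULL);
	d->isr &= ~d->posted;
	d->posted = 0;
	return (d->isr);
}

/* What the interrupt controller sees: the device's real state. */
static bool
dev_irq_asserted(const struct fake_dev *d)
{
	assert(d != NULL);
	return (d->isr != 0);
}

/* Handler that acks and returns: the line may still be asserted at EOI. */
static void
handler_no_readback(struct fake_dev *d)
{
	uint32_t status = dev_read_isr(d);

	dev_write_isr(d, status);
}

/* Handler that reads the ISR back so the ack has landed before EOI. */
static void
handler_with_readback(struct fake_dev *d)
{
	uint32_t status = dev_read_isr(d);

	dev_write_isr(d, status);
	(void)dev_read_isr(d);	/* flush the posted clear */
}
```

With a device that starts out with status bits set, handler_no_readback()
returns while dev_irq_asserted() is still true, which is the window in
which the EOI can outrace the clear. handler_with_readback() returns only
after the model's device has dropped the interrupt.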