From: Ian Lepore
To: John-Mark Gurney
Subject: Re: Cubieboard: Spurious interrupt detected
Date: Fri, 05 Sep 2014 19:33:17 -0600
Message-ID: <1409967197.1150.339.camel@revolution.hippie.lan>
In-Reply-To: <20140906011526.GT82175@funkthat.com>
Cc: "freebsd-arm@freebsd.org",
 ticso@cicely.de

On Fri, 2014-09-05 at 18:15 -0700, John-Mark Gurney wrote:
> Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 17:44 -0700:
> > On 5 September 2014 16:11, Ian Lepore wrote:
> > > On Fri, 2014-09-05 at 23:57 +0200, Bernd Walter wrote:
> > >> On Sat, Sep 06, 2014 at 01:43:23AM +0400, Maxim V FIlimonov wrote:
> > >> > And another problem: every now and then the kernel says something like that:
> > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> > >> > Sep 5 19:23:46 last message repeated 10 times
> > >> >
> > >> > I've heard that FreeBSD happens to do that on ARM devices. What could be the
> > >> > problem here?
> > >>
> > >> Means something generates interrupts, which are not handled by a driver.
> > >> Could be the cause for your load problem too.
> > >>
> > >
> > > No, that would be stray interrupts. Spurious interrupts happen when an
> > > interrupt is asserted, but by the time the processor asks the interrupt
> > > controller for the current active interrupt, it is no longer active.
> > >
> > > One way it can happen is when an interrupt handler writes to a device to
> > > clear a pending interrupt and that write takes a long time to complete
> > > because the device is on a slow bus, and the interrupt controller is on
> > > a faster bus. The EOI to the controller outraces the device write that
> > > would clear the pending interrupt condition, so the processor is
> > > re-interrupted, but by the time it asks for the next active interrupt the
> > > device write has finally completed and the interrupt is no longer
> > > pending.
> > >
> > > That sequence used to happen a lot, and it was "fixed" by adding an
> > > l2cache sync (basically a "drain write buffer") just before an EOI. You
> > > sometimes still see an occasional spurious interrupt, but it shouldn't
> > > be happening multiple times per second as seen in the logging above.
> >
> > Hm, interesting. I remember your discussion about it on IRC. The
> > atheros code ends up working around this in the driver by doing a read
> > from the ISR after writing out bits to clear things, so the clear is
> > flushed out.
> >
> > I wonder if we should be asking all device drivers to be doing their
> > own ISR flushing before returning from their interrupt handlers.
>
> This is required on PCI (that you do a read to clear the posted/pending
> write)... So, IMO, yes, all device drivers should do the proper
> clearing of their writes to the ISR...
>

But a driver can't assume that a read is sufficient on all architectures
it may run on. bus_space_barrier() is the right way. Also, it's not
just that a barrier is needed before exiting an isr... if the isr uses
locking to synchronize with hardware access by the non-isr part of the
driver, then the bus space barriers are needed in conjunction with the
locking, so that, for example, the isr's usage of the hardware is truly
complete before a lock is released.

Scattered amongst 10 of the roughly 240 drivers in sys/dev there are 42
calls to bus_space_barrier(). Getting all the drivers fixed will be a
big job. That's why I was thinking along the lines of an
architecture-wide workaround with potentially a way to mark a driver as
not needing the workaround once we get the fixing underway.

-- Ian
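
The race described in this thread (the EOI outrunning the device's "clear interrupt" write) can be sketched as a small user-space toy model. This is only an illustration under stated assumptions, not FreeBSD code: struct sim, post_clear_write(), drain_posted_writes(), eoi(), query_active() and simulate_handler() are all hypothetical names invented here. The drain step stands in for whatever forces the posted write out before the EOI, which in a real driver would roughly correspond to a bus_space_barrier() call with BUS_SPACE_BARRIER_WRITE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in FreeBSD. */
struct sim {
    bool line_asserted;   /* device is driving its interrupt line */
    bool clear_in_flight; /* "clear" write posted, not yet at the device */
};

/* The handler writes the device's clear register; on a slow bus the
 * write is only posted and has not reached the device yet. */
static void post_clear_write(struct sim *s)
{
    assert(s != NULL);
    s->clear_in_flight = true;
}

/* The posted write finally reaches the device, which deasserts its line.
 * Calling this before the EOI models a write barrier. */
static void drain_posted_writes(struct sim *s)
{
    assert(s != NULL);
    if (s->clear_in_flight) {
        s->line_asserted = false;
        s->clear_in_flight = false;
    }
}

/* EOI on the fast bus: the controller resamples the line. Returns true
 * if the line is still asserted, i.e. the CPU is interrupted again. */
static bool eoi(const struct sim *s)
{
    assert(s != NULL);
    return s->line_asserted;
}

/* The CPU asks the controller whether an interrupt is currently active. */
static bool query_active(const struct sim *s)
{
    assert(s != NULL);
    return s->line_asserted;
}

/* Runs one interrupt through the model and returns the number of
 * spurious interrupts observed (0 or 1). */
int simulate_handler(bool barrier_before_eoi)
{
    struct sim s = { .line_asserted = true, .clear_in_flight = false };

    post_clear_write(&s);
    if (barrier_before_eoi)
        drain_posted_writes(&s);  /* write completes before the EOI */

    bool reinterrupted = eoi(&s);
    drain_posted_writes(&s);      /* a late write lands after the EOI */

    /* Re-interrupted, but nothing is active any more: spurious. */
    return (reinterrupted && !query_active(&s)) ? 1 : 0;
}
```

In this model simulate_handler(false) returns 1, because the EOI sees the line still asserted and the write lands before the CPU queries the controller, while simulate_handler(true) returns 0. The model deliberately ignores the locking case mentioned above, where the barrier would also have to come before the lock is released.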