From: Harm Weites
Date: Sat, 26 Jul 2014 22:29:18 +0200
To: Adrian Chadd
Cc: "freebsd-mips@freebsd.org"
Subject: Re: interrupt storm arge0, tplink 1043nd
Message-ID: <53D40F9E.6020409@weites.com>
References: <53CEB6B1.9050301@weites.com>

Oops, of course it didn't work... After passing the correct argument
(&sc->intr_status, instead of sc) I got answers.

These are the results of running sysctl three times, producing 4 lines
per run (presumably 2 lines for arge0 and 2 lines for the dumb arge1).
The first run took place after boot, the second a while after that and
the third just after the storm.

interrupt 1 count 135
interrupt 1 count 135
interrupt 1 count 0
interrupt 1 count 0
interrupt 1 count 4738
interrupt 1 count 4738
interrupt 1 count 0
interrupt 1 count 0
interrupt 1 count 5041
interrupt 1 count 5041
interrupt 1 count 0
interrupt 1 count 0

interrupt 4 count 108
interrupt 4 count 108
interrupt 4 count 0
interrupt 4 count 0
interrupt 4 count 15843
interrupt 4 count 15844
interrupt 4 count 0
interrupt 4 count 0
interrupt 4 count 35311
interrupt 4 count 35311
interrupt 4 count 0
interrupt 4 count 0

interrupt 6 count 0
interrupt 6 count 0
interrupt 6 count 0
interrupt 6 count 0
interrupt 6 count 4
interrupt 6 count 4
interrupt 6 count 0
interrupt 6 count 0
interrupt 6 count 11
interrupt 6 count 11
interrupt 6 count 0
interrupt 6 count 0

Interrupt 4 went up rather quickly, so that is likely the bad guy. Right?

Regards,
Harm

On 22-07-14 21:26, Adrian Chadd wrote:
> Hi!
>
> So, ignore the ath0 stuff for now. int2 should be arge0, right?
>
> What's vmstat -ia say?
>
> Assuming it's actually arge0, we need to add some debugging counters
> to the interrupt path to count how many of each interrupt are
> occurring. The stuff I stuck behind ARGEDEBUG() is useful for debugging
> some silly bugs but not at the rate that you're getting interrupts.
>
> So I'd add something like this to the arge softc struct:
>
> uint32_t intr_status[32];
>
> .. then in the interrupt routine, something like this:
>
> temp_status = status;
> for (i = 0; i < 32; i++) {
>     if (temp_status & 1) {
>         intr_status[i]++;
>     }
>     temp_status = temp_status >> 1;
> }
>
> That'll count the number of interrupts that are firing for each
> interrupt status bit.
>
> Then, you'll want to write a sysctl for it. Have a look at
> if_ath_sysctl.c for the SYSCTL_PROC() entries.
> Just write one that when called will just printf() the intr_status
> array:
>
> for (i = 0; i < 32; i++) {
>     printf("interrupt %d count %u\n", i, intr_status[i]);
> }
>
> Just make sure you do a complete kernel recompile as changing the
> headers doesn't always force the source files to recompile.
>
>
> -a
>
>
> On 22 July 2014 12:08, Harm Weites wrote:
>> Hi,
>>
>> My 1043nd is complaining about interrupt storms, presumably only when
>> wifi is being used. When this occurs, networking is gone.
>>
>> The exact message that's flooding me:
>> interrupt storm detected on "int2"; throttling interrupt source
>>
>> The device associated with int2 is arge0.
>>
>> Some possibly related logs, though these messages start at boot:
>>
>> # /sbin/dmesg | tail
>> ath0: stuck beacon; resetting (bmiss count 4)
>> ar5416StopDmaReceive: dma failed to stop in 10ms
>> AR_CR=0x00000024
>> AR_DIAG_SW=0x42000020
>> MBSSID Set bit 22 of AR_STA_ID 0xb8c16866
>> ath0: stuck beacon; resetting (bmiss count 4)
>> ar5416StopDmaReceive: dma failed to stop in 10ms
>> AR_CR=0x00000024
>> AR_DIAG_SW=0x42000020
>> MBSSID Set bit 22 of AR_STA_ID 0xb8c16866
>>
>> This unit is configured with (arge0) port0 bound to device vlan1, port4
>> to vlan2 and ports 1,2,3 make up vlan3. There is wlan0, bound to ath0
>> and a bridge device that connects wlan0 to vlan3. There is a dhcp server
>> running in vlan3 to answer to wifi clients, internet is routed through
>> vlan1. This initially works but after a little while the storm begins
>> and the wifi client is left to die.
>>
>> Adrian@ suggested to start with reading which interrupt(s) occur(s), but
>> that is perhaps a little too hard for me to code :) Looking at if_arge.c,
>> it seems there is some debug code already in place (ARGEDEBUG()) though
>> I'm not sure on how to use that.
>> Reading from the AR71XX_DMA_INTR register and mapping its content to
>> AR71XX_DMA_INTR_STATUS would be something I'd like to do with a
>> separate program (instead of boldly taking a deep dive into if_arge.c
>> and recompiling/flashing until something works).
>>
>> One of my other units is configured with just a vlan device per switch
>> port, no wifi and no bridge. A third unit is configured with a wlan0,
>> vlan1 (port0) and vlan2 (ports 1,2,3,4). Neither has shown any issues
>> in the past months. The only difference would be that this problem unit
>> has a bridge.
>>
>> Any thoughts on how to approach or 'just' fix this?
>>
>> Regards,
>> Harm
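
For reference, the per-bit counting suggested in this thread, and the
comparison of two sysctl runs, can be tried out in userspace. What follows
is only a rough sketch, not the actual if_arge.c change: NSTATUS_BITS,
count_status_bits(), dump_counters() and fastest_growing_bit() are
hypothetical names made up for illustration, and the counters live in a
plain array rather than in the arge softc.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: 32 status bits, matching the intr_status[32] array above. */
#define NSTATUS_BITS 32

/* Increment one counter for every bit that is set in the status word. */
void
count_status_bits(uint32_t counters[NSTATUS_BITS], uint32_t status)
{
	int i;

	for (i = 0; i < NSTATUS_BITS; i++) {
		if (status & 1)
			counters[i]++;
		status >>= 1;
	}
}

/* Print the non-zero counters in the format used by the sysctl output. */
void
dump_counters(const uint32_t counters[NSTATUS_BITS])
{
	int i;

	for (i = 0; i < NSTATUS_BITS; i++) {
		if (counters[i] != 0)
			printf("interrupt %d count %u\n", i,
			    (unsigned int)counters[i]);
	}
}

/*
 * Compare two snapshots of the counters and return the index of the
 * status bit that grew the most in between (0 if nothing grew).
 */
int
fastest_growing_bit(const uint32_t before[NSTATUS_BITS],
    const uint32_t after[NSTATUS_BITS])
{
	uint32_t delta, best_delta = 0;
	int i, best = 0;

	for (i = 0; i < NSTATUS_BITS; i++) {
		delta = after[i] - before[i];
		if (delta > best_delta) {
			best_delta = delta;
			best = i;
		}
	}
	return (best);
}
```

Fed with the first and third runs reported above (135, 108, 0 and then
5041, 35311, 11 for bits 1, 4 and 6), the deltas are 4906, 35203 and 11,
so fastest_growing_bit() would return 4, which matches the suspicion
about interrupt 4.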