Date:      Fri, 27 Apr 2012 15:20:17 -0700
From:      YongHyeon PYUN <pyunyh@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Pavel Gorshkov <gorshkov.pavel@gmail.com>, freebsd-stable@freebsd.org
Subject:   Re: msk0: interrupt storm
Message-ID:  <20120427222017.GA17009@michelle.cdnetworks.com>
In-Reply-To: <201204250935.28973.jhb@freebsd.org>
References:  <20120228210329.GA2741@localhost> <20120424200719.GB6932@michelle.cdnetworks.com> <201204241507.14699.jhb@freebsd.org> <201204250935.28973.jhb@freebsd.org>

--oyUTqETQ0mS9luUI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 25, 2012 at 09:35:28AM -0400, John Baldwin wrote:
> On Tuesday, April 24, 2012 3:07:14 pm John Baldwin wrote:
> > On Tuesday, April 24, 2012 4:07:19 pm YongHyeon PYUN wrote:
> > > On Mon, Apr 23, 2012 at 10:24:41AM -0400, John Baldwin wrote:
> > > > On Wednesday, March 07, 2012 3:40:53 pm YongHyeon PYUN wrote:
> > > > > On Tue, Mar 06, 2012 at 10:36:05AM -0500, John Baldwin wrote:
> > > > > > On Thursday, March 01, 2012 8:29:55 pm YongHyeon PYUN wrote:
> > > > > > > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > > > > > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and
> > > > > > > > (sometimes) unfreezes intermittently, logging the following:
> > > > > > > > 
> > > > > > > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
> > > > > > > > 
> > > > > > > > $ vmstat -i
> > > > > > > > ...
> > > > > > > > irq259: mskc0                   11669511       3456
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Looks very similar to this:
> > > > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > > > > > > > 
> > > > > > > > Any suggestions?
> > > > > > > 
> > > > > > > Try disabling MSI and see whether that makes any difference.
> > > > > > 
> > > > > > I also get interrupt storms with msk.  They do fix themselves when
> > > > > > they happen, and I've seen it happen when the machine is idle.  This
> > > > > > is on my little netbook where msk had several problems initially that
> > > > > > have since been fixed.
> > > > > > 
> > > > > > mskc0: <Marvell Yukon 88E8072 Gigabit Ethernet> port 0x2000-0x20ff mem 0xe0000000-0xe0003fff irq 19 at device 0.0 on pci32
> > > > > > msk0: <Marvell Technology Group Ltd. Yukon EX Id 0xb5 Rev 0x02> on mskc0
> > > > > > msk0: Ethernet address: 00:24:81:40:e3:ef
> > > > > > miibus0: <MII bus> on msk0
> > > > > > e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
> > > > > > e1000phy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > > 
> > > > > > mskc0@pci0:32:0:0:      class=0x020000 card=0x3056103c chip=0x436c11ab rev=0x10 hdr=0x00
> > > > > >     vendor     = 'Marvell Technology Group Ltd.'
> > > > > >     device     = '88E8072 PCI-E Gigabit Ethernet Controller'
> > > > > >     class      = network
> > > > > >     subclass   = ethernet
> > > > > > 
> > > > > 
> > > > > John, can you let me know the value of B0_Y2_SP_ISRC2 register in
> > > > > interrupt handler when you see the interrupt storm?
> > > > 
> > > > I finally tested this.  I added some KTR traces to dump ISRC2 on each
> > > > call to msk_intr() and hacked the interrupt thread code to turn KTR
> > > > tracing off when a storm occurred.  The traces look like this:
> > > > 
> > > > index  cpu timestamp        trace
> > > > ------ --- ---------------- ----- 
> > > >    148   0  111662766108828 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    147   0  111662765994576 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    146   0  111662765380260 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    145   0  111662765257308 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    144   0  111662765134356 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    143   0  111662765011560 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    142   0  111662764888656 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    141   0  111662764773924 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    140   0  111662764659360 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    139   0  111662764528140 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    138   0  111662764413576 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    137   0  111662764287852 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > > ...
> > > > 
> > > > (All traces have the same register value.)  The TSC on this netbook runs
> > > > at machdep.tsc_freq: 1596035244
> > > > 
> > > > (The timestamps above are TSC values.)
> > > > 
> > > > Let me know if you'd like me to log more stuff in the driver.  Thanks!
> > > 
> > > I wonder why the device gets a TWSI completion interrupt, since the
> > > driver does not monitor the temperature sensor.  In addition, the
> > > interrupt was already disabled, so I have no idea how this can happen.
> > > Here, I assume your controller implements the optional temperature
> > > sensor and that it is monitored by H/W.
> > > Anyway, try the attached patch and let me know whether it makes any
> > > difference.
> > 
> > It does fix the interrupt storms.  I added a debugging printf to fire each 
> > time msk_intr() sees this bit to see if it storms, etc.  What I see is that 
> > each time I would previously get a single printf reporting an interrupt storm, 
> > I now get a single printf reporting that the TWSI_RDY bit was set.
> 
> Sadly, I spoke too soon.  With this patch applied I got another storm last night
> where this bit was not set during the storm:
> 
> index  cpu timestamp        trace
> ------ --- ---------------- ----- 
>     71   0   36775451301480 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     70   0   36775450145436 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     69   0   36775449956940 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     68   0   36775449768564 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     67   0   36775448604912 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     66   0   36775448416872 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     65   0   36775448220444 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     64   0   36775446569772 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     63   0   36775445385804 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     62   0   36775445189340 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     61   0   36775444984476 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     60   0   36775443829368 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     59   0   36775443640920 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     58   0   36775443444324 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     57   0   36775442272836 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
> 
> Apr 24 14:50:53 pipkin kernel: mskc0: TWSI ready
> Apr 24 14:50:54 pipkin kernel: mskc0: TWSI ready
> Apr 24 20:19:49 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 24 21:40:14 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 25 05:19:31 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 25 08:04:38 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> 
> (This trace was from the storm at 20:19:49.)
> 

Hmm, would you give the attached patch a try?
Yukon Extreme seems to have a new flow control feature, but I'm not
sure whether it is related to the interrupt storm you're seeing.

--oyUTqETQ0mS9luUI
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="msk.flowctl.diff"

Index: sys/dev/msk/if_msk.c
===================================================================
--- sys/dev/msk/if_msk.c	(revision 234725)
+++ sys/dev/msk/if_msk.c	(working copy)
@@ -3863,6 +3863,11 @@
 	gmac = DATA_BLIND_VAL(DATA_BLIND_DEF) |
 	    GM_SMOD_VLAN_ENA | IPG_DATA_VAL(IPG_DATA_DEF);
 
+	if ((sc->msk_hw_id == CHIP_ID_YUKON_EC_U &&
+	    sc->msk_hw_rev == CHIP_REV_YU_EC_U_B1) ||
+	    (sc->msk_hw_id == CHIP_ID_YUKON_EX &&
+	    sc->msk_hw_rev != CHIP_REV_YU_EX_A0))
+		gmac |= GM_NEW_FLOW_CTRL;
 	if (ifp->if_mtu > ETHERMTU)
 		gmac |= GM_SMOD_JUMBO_ENA;
 	GMAC_WRITE_2(sc, sc_if->msk_port, GM_SERIAL_MODE, gmac);
Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 234725)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -918,6 +918,8 @@
 
 #define	CHIP_REV_YU_EC_U_A0	1
 #define	CHIP_REV_YU_EC_U_A1	2
+#define	CHIP_REV_YU_EC_U_B0	3
+#define	CHIP_REV_YU_EC_U_B1	5
 
 #define	CHIP_REV_YU_FE_P_A0	0 /* Chip Rev. for Yukon-2 FE+ A0 */
 
@@ -1881,6 +1883,7 @@
 #define GM_SMOD_LIMIT_4		BIT_10	/* 4 consecutive Tx trials */
 #define GM_SMOD_VLAN_ENA	BIT_9	/* Enable VLAN  (Max. Frame Len) */
 #define GM_SMOD_JUMBO_ENA	BIT_8	/* Enable Jumbo (Max. Frame Len) */
+#define GM_NEW_FLOW_CTRL	BIT_6	/* Enable New Flow-Control */
 #define GM_SMOD_IPG_MSK		0x1f	/* Bit  4.. 0:	Inter-Packet Gap (IPG) */
 
 #define DATA_BLIND_VAL(x)	(SHIFT11(x) & GM_SMOD_DATABL_MSK)

--oyUTqETQ0mS9luUI--