Date: Fri, 27 Apr 2012 15:20:17 -0700
From: YongHyeon PYUN <pyunyh@gmail.com>
To: John Baldwin <jhb@freebsd.org>
Cc: Pavel Gorshkov <gorshkov.pavel@gmail.com>, freebsd-stable@freebsd.org
Subject: Re: msk0: interrupt storm
Message-ID: <20120427222017.GA17009@michelle.cdnetworks.com>
In-Reply-To: <201204250935.28973.jhb@freebsd.org>
References: <20120228210329.GA2741@localhost> <20120424200719.GB6932@michelle.cdnetworks.com> <201204241507.14699.jhb@freebsd.org> <201204250935.28973.jhb@freebsd.org>
--oyUTqETQ0mS9luUI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 25, 2012 at 09:35:28AM -0400, John Baldwin wrote:
> On Tuesday, April 24, 2012 3:07:14 pm John Baldwin wrote:
> > On Tuesday, April 24, 2012 4:07:19 pm YongHyeon PYUN wrote:
> > > On Mon, Apr 23, 2012 at 10:24:41AM -0400, John Baldwin wrote:
> > > > On Wednesday, March 07, 2012 3:40:53 pm YongHyeon PYUN wrote:
> > > > > On Tue, Mar 06, 2012 at 10:36:05AM -0500, John Baldwin wrote:
> > > > > > On Thursday, March 01, 2012 8:29:55 pm YongHyeon PYUN wrote:
> > > > > > > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > > > > > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and
> > > > > > > > (sometimes) unfreezes intermittently, logging the following:
> > > > > > > >
> > > > > > > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
> > > > > > > >
> > > > > > > > $ vmstat -i
> > > > > > > > ...
> > > > > > > > irq259: mskc0 11669511 3456
> > > > > > > >
> > > > > > > > Looks very similar to this:
> > > > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > > > > > > >
> > > > > > > > Any suggestions?
> > > > > > >
> > > > > > > Try disabling MSI and see whether that makes any difference.
> > > > > >
> > > > > > I also get interrupt storms with msk. They do fix themselves when they
> > > > > > happen, and I've seen it happen when the machine is idle. This is on my
> > > > > > little netbook where msk had several problems initially that have since been
> > > > > > fixed.
> > > > > >
> > > > > > mskc0: <Marvell Yukon 88E8072 Gigabit Ethernet> port 0x2000-0x20ff mem
> > > > > > 0xe0000000-0xe0003fff irq 19 at device 0.0 on pci32
> > > > > > msk0: <Marvell Technology Group Ltd. Yukon EX Id 0xb5 Rev 0x02> on mskc0
> > > > > > msk0: Ethernet address: 00:24:81:40:e3:ef
> > > > > > miibus0: <MII bus> on msk0
> > > > > > e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
> > > > > > e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > >
> > > > > > mskc0@pci0:32:0:0: class=0x020000 card=0x3056103c chip=0x436c11ab
> > > > > > rev=0x10 hdr=0x00
> > > > > >     vendor = 'Marvell Technology Group Ltd.'
> > > > > >     device = '88E8072 PCI-E Gigabit Ethernet Controller'
> > > > > >     class = network
> > > > > >     subclass = ethernet
> > > > > >
> > > > >
> > > > > John, can you let me know the value of B0_Y2_SP_ISRC2 register in
> > > > > interrupt handler when you see the interrupt storm?
> > > >
> > > > I finally tested this. I added some KTR traces to dump ISRC2 on each
> > > > call to msk_intr() and hacked the interrupt thread code to turn KTR tracing
> > > > off when a storm occurred. The traces look like this:
> > > >
> > > > index  cpu timestamp        trace
> > > > ------ --- ---------------- -----
> > > >    148   0  111662766108828 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    147   0  111662765994576 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    146   0  111662765380260 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    145   0  111662765257308 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    144   0  111662765134356 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    143   0  111662765011560 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    142   0  111662764888656 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    141   0  111662764773924 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    140   0  111662764659360 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    139   0  111662764528140 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    138   0  111662764413576 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > >    137   0  111662764287852 msk_intr: B0_Y2_SP_ISRC2 = 0x44000000
> > > > ...
> > > >
> > > > (All traces have the same register value.) The TSC on this netbook runs
> > > > at machdep.tsc_freq: 1596035244
> > > >
> > > > (The timestamps above are TSC values.)
> > > >
> > > > Let me know if you'd like me to log more stuff in the driver. Thanks!
> > >
> > > I wonder why the device gets TWSI completion interrupt since the
> > > driver does not monitor temperature sensor. In addition, the
> > > interrupt was already disabled so have no idea how this can happen.
> > > Here, I assume your controller implemented optional temperature
> > > sensor and it is monitored by H/W.
> > > Anyway, try attached patch and let me know whether it makes any
> > > difference.
> >
> > It does fix the interrupt storms. I added a debugging printf to fire each
> > time msk_intr() sees this bit to see if it storms, etc. What I see is that
> > each time I would previously get a single printf reporting an interrupt storm,
> > I now get a single printf reporting that the TWSI_RDY bit was set.
>
> Sadly, I spoke too soon. With this patch applied I got another storm last night
> where this bit was not set during the storm:
>
> index  cpu timestamp        trace
> ------ --- ---------------- -----
>     71   0   36775451301480 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     70   0   36775450145436 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     69   0   36775449956940 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     68   0   36775449768564 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     67   0   36775448604912 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     66   0   36775448416872 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     65   0   36775448220444 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     64   0   36775446569772 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     63   0   36775445385804 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     62   0   36775445189340 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     61   0   36775444984476 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     60   0   36775443829368 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     59   0   36775443640920 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     58   0   36775443444324 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>     57   0   36775442272836 msk_intr: B0_Y2_SP_ISRC2 = 0x40000000
>
> Apr 24 14:50:53 pipkin kernel: mskc0: TWSI ready
> Apr 24 14:50:54 pipkin kernel: mskc0: TWSI ready
> Apr 24 20:19:49 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 24 21:40:14 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 25 05:19:31 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
> Apr 25 08:04:38 pipkin kernel: interrupt storm detected on "irq257:"; throttling interrupt source
>
> (This trace was from the storm at 20:19:49.)
>

Hmm, would you give the attached patch a try? Yukon Extreme seems to
have a new flow control feature, but I'm not sure whether it is
related to the interrupt storm you're seeing.
--oyUTqETQ0mS9luUI
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="msk.flowctl.diff"

Index: sys/dev/msk/if_msk.c
===================================================================
--- sys/dev/msk/if_msk.c	(revision 234725)
+++ sys/dev/msk/if_msk.c	(working copy)
@@ -3863,6 +3863,11 @@
 
 	gmac = DATA_BLIND_VAL(DATA_BLIND_DEF) | GM_SMOD_VLAN_ENA |
 	    IPG_DATA_VAL(IPG_DATA_DEF);
+	if ((sc->msk_hw_id == CHIP_ID_YUKON_EC_U &&
+	    sc->msk_hw_rev == CHIP_REV_YU_EC_U_B1) ||
+	    (sc->msk_hw_id == CHIP_ID_YUKON_EX &&
+	    sc->msk_hw_rev != CHIP_REV_YU_EX_A0))
+		gmac |= GM_NEW_FLOW_CTRL;
 	if (ifp->if_mtu > ETHERMTU)
 		gmac |= GM_SMOD_JUMBO_ENA;
 	GMAC_WRITE_2(sc, sc_if->msk_port, GM_SERIAL_MODE, gmac);
Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 234725)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -918,6 +918,8 @@
 
 #define CHIP_REV_YU_EC_U_A0	1
 #define CHIP_REV_YU_EC_U_A1	2
+#define CHIP_REV_YU_EC_U_B0	3
+#define CHIP_REV_YU_EC_U_B1	5
 
 #define CHIP_REV_YU_FE_P_A0	0 /* Chip Rev. for Yukon-2 FE+ A0 */
 
@@ -1881,6 +1883,7 @@
 #define GM_SMOD_LIMIT_4	BIT_10	/* 4 consecutive Tx trials */
 #define GM_SMOD_VLAN_ENA	BIT_9	/* Enable VLAN (Max. Frame Len) */
 #define GM_SMOD_JUMBO_ENA	BIT_8	/* Enable Jumbo (Max. Frame Len) */
+#define GM_NEW_FLOW_CTRL	BIT_6	/* Enable New Flow-Control */
 #define GM_SMOD_IPG_MSK	0x1f	/* Bit 4.. 0: Inter-Packet Gap (IPG) */
 
 #define DATA_BLIND_VAL(x)	(SHIFT11(x) & GM_SMOD_DATABL_MSK)

--oyUTqETQ0mS9luUI--