From: Curtis Villamizar
To: curtis@ipv6.occnc.com
Cc: Yonghyeon PYUN, freebsd-stable@freebsd.org
Subject: Re: regression: msk0 watchdog timeout and interrupt storm
Date: Sun, 05 Jan 2014 23:30:45 -0500
Message-Id: <201401060430.s064UjCG090668@maildrop2.v6ds.occnc.com>
In-reply-to: Your message of "Wed, 01 Jan 2014 16:44:57 -0500."
    <201401012144.s01LivSi099164@maildrop2.v6ds.occnc.com>
Reply-To: curtis@ipv6.occnc.com

Pyun,

Replying to self since I did not get your reply but saw it on the
stable10 mailing list archive.  I pasted in your responses so it's
really a reply to you.

Sorry for the delay in replying to your email of Jan 2.  I had some
email trouble (self-induced by a DNS change) that should be fixed now.

Curtis


In message <201401012144.s01LivSi099164@maildrop2.v6ds.occnc.com>
Curtis Villamizar writes:
> >
> > Replying to self (and top posting).
> >
> > I'm not sure if the problem is fixed or masked.
> >
> > The symptom (watchdog and interrupt storm) has gone away with the
> > following change in if_mskreg.h:
> >
> > @@ -2329,8 +2329,13 @@
> >   */
> >  #if (BUS_SPACE_MAXADDR > 0xFFFFFFFF)
> >  #define MSK_64BIT_DMA
> > +#if 1
> > +#define MSK_TX_RING_CNT 256
> > +#define MSK_RX_RING_CNT 256
> > +#else
> >  #define MSK_TX_RING_CNT 384
> >  #define MSK_RX_RING_CNT 512
> > +#endif
> >  #else
> >  #undef MSK_64BIT_DMA
> >  #define MSK_TX_RING_CNT 256
> >
> > This backs out a very small part of the change made to if_mskreg.h in
> > revision 227582.
> >
> > The following is what I think is affected by this change:
> >
> >     count = imin(4096, roundup2(count, 1024));
> >     sc->msk_stat_count = count;
> >     stat_sz = count * sizeof(struct msk_stat_desc);
> >
> > The change makes count end up being 1024 (and stat_sz 8192).
> >
> > For me the problem is fixed/masked but I would also consider putting
> > the increase to MSK_TX_RING_CNT and MSK_RX_RING_CNT back and forcing
> > count above to be no greater than 1024 if that would help someone else
> > debug the problem.  I'm not sure where the 4096 came from but
> > replacing that with 1024 is equivalent to "count = 1024" with no math
> > involved.
>
> Marvell calls DMA descriptors LEs.  The maximum number of status
> LEs supported by the controller is 4096 and it should be large enough
> to hold status LE updates (for dual-port controllers, the status
> DMA block is shared between the ports).

Yes.  I am aware of this, but regardless I ran into this bug and
forcing MSK_TX_RING_CNT and MSK_RX_RING_CNT to 256 removed the symptom.

> > This does seem to me like a regression in 10.0 caused by the change to
> > if_mskreg.h (Nov 16).  The workaround so far has been fine for me.
>
> If you revert the change made in r258790, does the issue go away?
> Are you running amd64?  Because you touched the
> #if (BUS_SPACE_MAXADDR > 0xFFFFFFFF) block in if_mskreg.h I guess
> you're running amd64 but I need confirmation.  If your system has more
> than 4GB of memory on amd64, could you reduce the amount of available
> memory to less than 4GB (i.e. set hw.physmem in loader.conf)?
> Also, would you show me the dmesg(8) output (msk(4) and e1000phy(4)
> only) so I know the exact Yukon controller model?

Yes it is AMD64.

uname -m
amd64

CPU: AMD Athlon(tm) II X2 B24 Processor (2992.58-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x100f63  Family = 0x10  Model = 0x6  Stepping = 3
  Features=0x178bfbff
  Features2=0x802009
  AMD Features=0xee500800
  AMD Features2=0x37ff
  TSC: P-state invariant

pciconf -lcv
[...]
mskc0@pci0:2:0:0: class=0x020000 card=0x305817aa chip=0x438011ab rev=0x10 hdr=0x00
    vendor     = 'Marvell Technology Group Ltd.'
    device     = '88E8057 PCI-E Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
    cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
                 speed 2.5(2.5) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0003[130] = Serial 1 ef3856ffffdc9cc8

Please let me know what I could do to help debug this.  I did not back
out the change entirely (yet).  I only effectively backed out the
change to the two constants MSK_TX_RING_CNT and MSK_RX_RING_CNT and
that was enough to make the problem go away.

Curtis


> > Curtis
> >
> >
> > In message <201401010153.s011rNcm082703@maildrop2.v6ds.occnc.com>
> > Curtis Villamizar writes:
> > >
> > > I'm getting an interrupt storm from mskc running with the latest
> > > if_msk.c code.  The OS is built from source (259540):
> > >
> > > FreeBSD 10.0-PRERELEASE (GENERIC) #0 r259540: Sat Dec 21 00:05:39 EST 2013
> > >
> > > While not the latest, the point is that sys/dev/msk is up to date wrt
> > > stable_9 and also wrt head.
> > >
> > > The odd thing is that the machine seemed to run fine for a day or two
> > > and then started exhibiting this behaviour and has become useless.
> > >
> > > This is now highly reproducible (it happens within seconds when trying
> > > to do a long file transfer between two machines with GbE) so if there
> > > is anything I can do to instrument this, please make suggestions.
> > >
> > > What I know so far is:
> > >
> > >   1.  When the watchdog occurs, Y2_IS_STAT_BMU is set in the prior
> > >       interrupt mask.
> > >
> > >   2.  This would take us from msk_intr into msk_handle_events, with
> > >       msk_handle_events returning 0.
> > >
> > >   3.  msk_handle_events reads in sc->msk_stat_cons.  The last recorded
> > >       value of sc->msk_stat_cons is always 1024.
> > >
> > >   4.  The only way to exit msk_handle_events with sc->msk_stat_cons
> > >       greater than zero yet not do anything is to hit the conditional
> > >       at the top of the loop and fall out:
> > >
> > >           sd = &sc->msk_stat_ring[cons];
> > >           control = le32toh(sd->msk_control);
> > >           if ((control & HW_OWNER) == 0)
> > >               break;
> > >
> > >   5.  The code after the loop can return zero if the ring buffer
> > >       pointer hasn't moved.  That code is:
> > >
> > >           sc->msk_stat_cons = cons;
> > >           bus_dmamap_sync(sc->msk_stat_tag, sc->msk_stat_map,
> > >               BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
> > >
> > >           if (rxput[MSK_PORT_A] > 0)
> > >               msk_rxput(sc->msk_if[MSK_PORT_A]);
> > >           if (rxput[MSK_PORT_B] > 0)
> > >               msk_rxput(sc->msk_if[MSK_PORT_B]);
> > >
> > >           return (sc->msk_stat_cons != CSR_READ_2(sc, STAT_PUT_IDX));
> > >
> > >   6.  If the return value is zero, the interrupt isn't cleared.  That
> > >       was suspect.  The code in msk_intr is:
> > >
> > >           domore = msk_handle_events(sc);
> > >           if ((status & Y2_IS_STAT_BMU) != 0 && domore == 0)
> > >               CSR_WRITE_4(sc, STAT_CTRL, SC_STAT_CLR_IRQ);
> > >
> > >   7.  This code before the return in msk_handle_events should force
> > >       the clear but doesn't fix anything.
> > >
> > >           if ((control & HW_OWNER) == 0)
> > >               return;
> > >
> > > This looks like some sort of fall off the end of a ring buffer type of
> > > problem (since it always points to entry 0x400) but since I haven't
> > > done driver work in ages, that is mostly just a wild guess and I
> > > really have no idea yet as to what is going wrong.
> > >
> > > Also please keep me on the Cc since I'm not subscribed to the list,
> > > though I will check the archives from time to time.
> > >
> > > Thanks,
> > >
> > > Curtis
> > >
> > >
> > > reference:
> > > http://lists.freebsd.org/pipermail/freebsd-stable/2013-November/075699.html
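
For anyone who wants to check the status ring arithmetic discussed in
this thread, here is a minimal standalone sketch in plain C (it is not
driver code).  It assumes the driver seeds count as
3 * MSK_RX_RING_CNT + MSK_TX_RING_CNT before the imin/roundup2 line
quoted above; that seed does not appear in this thread and is an
assumption.  The names round_up_pow2, stat_ring_count, stat_ring_bytes
and struct stat_desc are hypothetical stand-ins, with the descriptor
taken to be 8 bytes because count 1024 is reported to give stat_sz 8192.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-byte status descriptor (two 32-bit words). */
struct stat_desc {
	uint32_t status;
	uint32_t control;
};

/* Round x up to a multiple of y, where y must be a power of two. */
static int
round_up_pow2(int x, int y)
{
	assert(y > 0 && (y & (y - 1)) == 0);
	return ((x + (y - 1)) & ~(y - 1));
}

/* Number of status ring entries for the given RX/TX ring sizes. */
static int
stat_ring_count(int rx_cnt, int tx_cnt)
{
	/* ASSUMPTION: seed value; not shown anywhere in this thread. */
	int count = 3 * rx_cnt + tx_cnt;

	/* Mirrors the quoted line: count = imin(4096, roundup2(count, 1024)); */
	count = round_up_pow2(count, 1024);
	return (count < 4096 ? count : 4096);
}

/* Size in bytes of the status ring for the given RX/TX ring sizes. */
static size_t
stat_ring_bytes(int rx_cnt, int tx_cnt)
{
	return ((size_t)stat_ring_count(rx_cnt, tx_cnt) *
	    sizeof(struct stat_desc));
}
```

Under that assumption, stat_ring_count(256, 256) gives 1024 and
stat_ring_bytes(256, 256) gives 8192, which matches the figures
reported above, while the 512/384 ring sizes would give 2048 entries.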