Date: Mon, 12 Jun 2017 13:31:15 +0000
From: Tom Vijlbrief <tvijlbrief@gmail.com>
To: Henri Hennebert <hlh@restart.be>, freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: [Bug 219927] awg0 stops working after a long output under ssh
Message-ID: <CAOQrpVe54_WNAq6WRTdwfQwXeDCF7N1HYpmqmtLASsCAJgFeqw@mail.gmail.com>
In-Reply-To: <CAOQrpVf6apmyxt07wfkneRbTFq6v%2BN-r1E=hW6ykM1O%2BXJ3k1w@mail.gmail.com>
References: <bug-219927-7@https.bugs.freebsd.org/bugzilla/>
 <CAOQrpVfHqKwy8-fOAjrSM9-KQcV-Mya7uCNX6v44zSuVHHNTOQ@mail.gmail.com>
 <158994e0-9f53-e64a-2b81-b554894571c6@restart.be>
 <CAOQrpVf6apmyxt07wfkneRbTFq6v%2BN-r1E=hW6ykM1O%2BXJ3k1w@mail.gmail.com>

I tested with TX_MAX_SEGS set to 20 and that is also stable for me, so I
added a patch to the original bug report:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927

The only downside I can see is a modest increase in kernel stack usage:

Index: sys/arm/allwinner/if_awg.c
===================================================================
--- sys/arm/allwinner/if_awg.c (revision 319826)
+++ sys/arm/allwinner/if_awg.c (working copy)
@@ -92,7 +92,7 @@
 #define TX_SKIP(n, o) (((n) + (o)) & (TX_DESC_COUNT - 1))
 #define RX_NEXT(n) (((n) + 1) & (RX_DESC_COUNT - 1))
 
-#define TX_MAX_SEGS 10
+#define TX_MAX_SEGS 20
 
 #define SOFT_RST_RETRY 1000
 #define MII_BUSY_RETRY 1000
@@ -419,14 +419,18 @@
      sc->tx.buf_map[index].map, m, segs, &nsegs, BUS_DMA_NOWAIT);
  if (error == EFBIG) {
   m = m_collapse(m, M_NOWAIT, TX_MAX_SEGS);
-  if (m == NULL)
+  if (m == NULL) {
+   device_printf(sc->miibus, "awg_setup_txbuf: m_collapse failed\n");
    return (0);
+  }
   *mp = m;
   error = bus_dmamap_load_mbuf_sg(sc->tx.buf_tag,
       sc->tx.buf_map[index].map, m, segs, &nsegs, BUS_DMA_NOWAIT);
  }
- if (error != 0)
+ if (error != 0) {
+  device_printf(sc->miibus, "awg_setup_txbuf: bus_dmamap_load_mbuf_sg failed\n");
   return (0);
+ }
 
  bus_dmamap_sync(sc->tx.buf_tag, sc->tx.buf_map[index].map,
      BUS_DMASYNC_PREWRITE);

On Mon, 12 Jun 2017 at 10:47, Tom Vijlbrief <tvijlbrief@gmail.com> wrote:

> On Mon, 12 Jun 2017 at 09:59, Henri Hennebert <hlh@restart.be> wrote:
>
>> On 06/11/2017 17:54, Tom Vijlbrief wrote:
>> >
>> > On Sun, 11 Jun 2017 at 16:23, <bugzilla-noreply@freebsd.org> wrote:
>> >
>> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927
>> >
>> > Bug ID: 219927
>> > Summary: awg0 stops working after a long output under ssh
>> > Product: Base System
>> > Version: CURRENT
>> > Hardware: arm64
>> > OS: Any
>> > Status: New
>> > Severity: Affects Only Me
>> > Priority: ---
>> > Component: arm
>> > Assignee: freebsd-arm@FreeBSD.org
>> > Reporter: hlh@restart.be
>> >
>> > Environment: pine64+ 2GB
>> > FreeBSD norquay.restart.bel 12.0-CURRENT FreeBSD 12.0-CURRENT #0
>> > r318945M: Sat Jun 10 11:47:44 CEST 2017
>> > root@norquay.restart.bel:/usr/obj/usr/src/sys/NORQUAY arm64
>> >
>> > If I connect from a wireless computer (FreeBSD 11.1-PRERELEASE #0
>> > r318860) and run a command with a big output (e.g. `find /`), awg0
>> > stops working quickly (under 20 seconds of output).
>> >
>> > If I do the same with telnet from the same computer, the output is
>> > much longer but awg0 stops working.
>> >
>> > If I do the same from a wired computer then I must run `find /` 2 or
>> > 3 times before awg0 stops working.
>> >
>> > I can rsync 12GB through ssh without problem in both directions
>> > (from and to the pine64 and the wireless computer).
>> >
>> > I have a `tcpdump -w ssh.data port 22`. (8.3 MB)
>> >
>> > I can connect with a serial console to the pine64 after awg0 stops
>> > working.
>> > ifconfig awg0 down
>> > ifconfig awg0 up
>> > doesn't restore the connectivity. I must reboot to restore
>> > connectivity.
>> >
>> >
>> > That's a coincidence, today I'm investigating the same issue.
>> >
>> > You could try increasing TX_MAX_SEGS in sys/arm/allwinner/if_awg.c
>> > line 95.
>> >
>> > I'm currently testing TX_MAX_SEGS set to 40 and no lock up yet....
>>
>> Bingo. Your solution solved the problem.
>>
>> Thanks a lot.
>>
>
> Good to hear!
>
> Increasing from 10 to 20 is probably sufficient. It is not clear to me
> what the adverse effects of too high a value are.
>
> The root cause is that the driver calls m_collapse with this limit and
> the call fails. The TCP stack then resends the packet and m_collapse
> fails again and again and ...
>
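
The failure mode described in the quoted message (m_collapse failing against TX_MAX_SEGS, the same data being retried, and failing again) can be illustrated with a small user-space model. This is only a sketch, not driver code: `struct chain_model`, `model_collapse`, and `model_transmit` are hypothetical names, and the model assumes that every retry presents a chain with the same shape and that a chain has some fixed minimum fragment count that collapsing can reach.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an mbuf chain; this is not a kernel structure. */
struct chain_model {
	int nfrags;     /* fragments in the chain as queued */
	int min_frags;  /* fewest fragments collapsing could reach (assumed) */
};

/*
 * Stand-in for m_collapse(m, M_NOWAIT, max_segs): succeeds only if the
 * chain can be reduced to at most max_segs fragments.
 */
static bool
model_collapse(struct chain_model *c, int max_segs)
{
	if (c->min_frags > max_segs)
		return (false);
	if (c->nfrags > max_segs)
		c->nfrags = max_segs;
	return (true);
}

/*
 * Try to send the same chain up to max_tries times. Returns the number
 * of the attempt that succeeded, or -1 if every attempt failed.
 */
static int
model_transmit(struct chain_model c, int max_segs, int max_tries)
{
	assert(max_segs > 0);

	for (int attempt = 1; attempt <= max_tries; attempt++) {
		if (c.nfrags <= max_segs || model_collapse(&c, max_segs))
			return (attempt);
		/* Dropped; the retry presents a chain of the same shape. */
	}
	return (-1);
}
```

With illustrative numbers, a chain of 14 fragments that can only be collapsed to 12 never goes out with a limit of 10 however often it is retried (`model_transmit` returns -1), while the same chain succeeds on the first attempt with a limit of 20.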
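
On the question of what a higher TX_MAX_SEGS costs: the stack-usage remark can be made concrete with a rough estimate. The sketch below assumes that the transmit path keeps its `segs` array on the stack with TX_MAX_SEGS entries, and that each entry holds a 64-bit address and a 64-bit length as on arm64. `struct seg_model` and `segs_array_bytes` are hypothetical stand-ins, not the kernel's `bus_dma_segment_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a DMA segment descriptor (assumed 64-bit fields). */
struct seg_model {
	uint64_t ds_addr;   /* bus address of the segment */
	uint64_t ds_len;    /* length of the segment in bytes */
};

/* The estimate below relies on this layout being 16 bytes per entry. */
static_assert(sizeof(struct seg_model) == 16, "unexpected segment size");

/* Bytes needed for an on-stack array of max_segs segment descriptors. */
static size_t
segs_array_bytes(size_t max_segs)
{
	return (max_segs * sizeof(struct seg_model));
}
```

Under these assumptions the array grows from 160 bytes at 10 segments to 320 bytes at 20 and 640 bytes at 40, which matches the description of the increase as modest.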