From owner-freebsd-arm@freebsd.org  Mon Jun 12 13:31:27 2017
MIME-Version: 1.0
References: <bug-219927-7@https.bugs.freebsd.org/bugzilla/>
 <CAOQrpVfHqKwy8-fOAjrSM9-KQcV-Mya7uCNX6v44zSuVHHNTOQ@mail.gmail.com>
 <158994e0-9f53-e64a-2b81-b554894571c6@restart.be>
 <CAOQrpVf6apmyxt07wfkneRbTFq6v+N-r1E=hW6ykM1O+XJ3k1w@mail.gmail.com>
In-Reply-To: <CAOQrpVf6apmyxt07wfkneRbTFq6v+N-r1E=hW6ykM1O+XJ3k1w@mail.gmail.com>
From: Tom Vijlbrief <tvijlbrief@gmail.com>
Date: Mon, 12 Jun 2017 13:31:15 +0000
Message-ID: <CAOQrpVe54_WNAq6WRTdwfQwXeDCF7N1HYpmqmtLASsCAJgFeqw@mail.gmail.com>
Subject: Re: [Bug 219927] awg0 stops working after a long output under ssh
To: Henri Hennebert <hlh@restart.be>, freebsd-arm <freebsd-arm@freebsd.org>
Content-Type: text/plain; charset="UTF-8"
List-Id: "Porting FreeBSD to ARM processors." <freebsd-arm.freebsd.org>

Tested with TX_MAX_SEGS at 20; that is also stable for me, so I added a
patch to the original bug report:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927

The only downside I can see is a modest increase in kernel stack usage:

Index: sys/arm/allwinner/if_awg.c
===================================================================
--- sys/arm/allwinner/if_awg.c  (revision 319826)
+++ sys/arm/allwinner/if_awg.c  (working copy)
@@ -92,7 +92,7 @@
 #define        TX_SKIP(n, o)           (((n) + (o)) & (TX_DESC_COUNT - 1))
 #define        RX_NEXT(n)              (((n) + 1) & (RX_DESC_COUNT - 1))

-#define        TX_MAX_SEGS             10
+#define        TX_MAX_SEGS             20

 #define        SOFT_RST_RETRY          1000
 #define        MII_BUSY_RETRY          1000
@@ -419,14 +419,18 @@
            sc->tx.buf_map[index].map, m, segs, &nsegs, BUS_DMA_NOWAIT);
        if (error == EFBIG) {
                m = m_collapse(m, M_NOWAIT, TX_MAX_SEGS);
-               if (m == NULL)
+               if (m == NULL) {
+                       device_printf(sc->miibus, "awg_setup_txbuf: m_collapse failed\n");
                        return (0);
+               }
                *mp = m;
                error = bus_dmamap_load_mbuf_sg(sc->tx.buf_tag,
                    sc->tx.buf_map[index].map, m, segs, &nsegs, BUS_DMA_NOWAIT);
        }
-       if (error != 0)
+       if (error != 0) {
+               device_printf(sc->miibus, "awg_setup_txbuf: bus_dmamap_load_mbuf_sg failed\n");
                return (0);
+       }

        bus_dmamap_sync(sc->tx.buf_tag, sc->tx.buf_map[index].map,
            BUS_DMASYNC_PREWRITE);
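
To put a number on the stack cost, here is a rough sketch. It assumes the
segs array handed to bus_dmamap_load_mbuf_sg is a local sized by
TX_MAX_SEGS, and that a DMA segment on arm64 is one 64-bit address plus one
64-bit length. The struct and function names below are made up for
illustration and are not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for bus_dma_segment_t on a 64-bit platform (assumption):
 * one bus address and one length, 8 bytes each.
 */
struct dma_seg_model {
	uint64_t ds_addr;
	uint64_t ds_len;
};

/* Bytes taken by an on-stack segs[max_segs] array. */
static size_t
segs_stack_bytes(size_t max_segs)
{
	return (max_segs * sizeof(struct dma_seg_model));
}

/* Additional stack bytes when the limit is raised from old_max to new_max. */
static size_t
extra_stack_bytes(size_t old_max, size_t new_max)
{
	return (segs_stack_bytes(new_max) - segs_stack_bytes(old_max));
}
```

Under those assumptions, extra_stack_bytes(10, 20) comes to 160 bytes and
extra_stack_bytes(10, 40) to 480 bytes.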


On Mon 12 Jun 2017 at 10:47, Tom Vijlbrief <tvijlbrief@gmail.com> wrote:

>
>
> On Mon 12 Jun 2017 at 09:59, Henri Hennebert <hlh@restart.be> wrote:
>
>> On 06/11/2017 17:54, Tom Vijlbrief wrote:
>> >
>> > On Sun 11 Jun 2017 at 16:23, <bugzilla-noreply@freebsd.org
>> > <mailto:bugzilla-noreply@freebsd.org>> wrote:
>> >
>> >     https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927
>> >
>> >                  Bug ID: 219927
>> >                 Summary: awg0 stops working after a long output under ssh
>> >                 Product: Base System
>> >                 Version: CURRENT
>> >                Hardware: arm64
>> >                      OS: Any
>> >                  Status: New
>> >                Severity: Affects Only Me
>> >                Priority: ---
>> >               Component: arm
>> >                Assignee: freebsd-arm@FreeBSD.org
>> >                Reporter: hlh@restart.be <mailto:hlh@restart.be>
>> >
>> >     Environment: pine64+ 2GB
>> >     FreeBSD norquay.restart.bel 12.0-CURRENT FreeBSD 12.0-CURRENT #0
>> >     r318945M: Sat
>> >     Jun 10 11:47:44 CEST 2017
>> >     root@norquay.restart.bel:/usr/obj/usr/src/sys/NORQUAY  arm64
>> >
>> >     If I connect from a wireless computer (FreeBSD 11.1-PRERELEASE #0
>> >     r318860) and run a command with a big output (e.g. `find /`), awg0
>> >     stops working quickly (under 20 seconds of output).
>> >
>> >     If I do the same with telnet from the same computer, the output
>> >     runs much longer, but awg0 still stops working.
>> >
>> >     If I do the same from a wired computer, I must run `find /` 2 or 3
>> >     times before awg0 stops working.
>> >
>> >     I can rsync 12GB through ssh without problems in both directions
>> >     (from and to the pine64 and the wireless computer).
>> >
>> >     I have a `tcpdump -w ssh.data port 22`. (8.3 MB)
>> >
>> >     I can connect with a serial console to the pine64 after awg0 stops
>> >     working.
>> >     ifconfig awg0 down
>> >     ifconfig awg0 up
>> >     doesn't restore connectivity. I must reboot to restore connectivity.
>> >
>> >
>> > That's a coincidence, today I'm investigating the same issue.
>> >
>> > You could try increasing TX_MAX_SEGS in sys/arm/allwinner/if_awg.c
>> > line 95.
>> >
>> > I'm currently testing TX_MAX_SEGS set to 40 and no lock up yet....
>>
>> Bingo. Your solution solved the problem.
>>
>> Thanks a lot.
>>
>
> Good to hear!
>
> Increasing from 10 to 20 is probably sufficient. It is not clear to me
> what the adverse effects of too high a value are.
>
> The root cause is that the driver calls m_collapse with this limit and
> the call fails. The TCP stack then resends the packet, m_collapse fails
> again, and so on.
>
>
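
As a toy illustration of the loop described above (not driver code): if a
chain cannot be squeezed into TX_MAX_SEGS fragments, offering the very same
chain again can only fail the same way. The function names and the
min_frags parameter are hypothetical; min_frags stands for the smallest
fragment count collapsing could reach for a given chain.

```c
#include <stdbool.h>

/*
 * Hypothetical model: the chain can be queued only if it fits into at
 * most max_segs fragments after collapsing.
 */
static bool
model_setup_txbuf(int min_frags, int max_segs)
{
	return (min_frags <= max_segs);
}

/*
 * Count failed attempts when the same chain is offered repeatedly,
 * as a retransmitting sender would do.
 */
static int
model_failed_attempts(int min_frags, int max_segs, int attempts)
{
	int failed = 0;

	for (int i = 0; i < attempts; i++) {
		if (!model_setup_txbuf(min_frags, max_segs))
			failed++;
	}
	return (failed);
}
```

With an illustrative chain that needs 15 fragments, every attempt fails
against a limit of 10, and none fails against a limit of 20.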