Date:      Fri, 30 Oct 2015 04:15:00 +0700
From:      Eugene Grosbein <eugen@grosbein.net>
To:        Adrian Chadd <adrian.chadd@gmail.com>
Cc:        "freebsd-mips@freebsd.org" <freebsd-mips@freebsd.org>
Subject:   Re: arge1 on TL WDR3600
Message-ID:  <56328C54.1050709@grosbein.net>
In-Reply-To: <CAJ-Vmo=iD1TdWdPU91TdKL8oW42C_fXUODeigTj55_xJF86AvA@mail.gmail.com>
References:  <562CBEC3.8030308@rdtc.ru> <CAJ-Vmok__9mD8OaFnU-sfVfr=xMRMW6-nfDUHScT_LNm6Ry2iA@mail.gmail.com> <562E3027.4020806@grosbein.net> <CAJ-VmonRt6OVOQDGLZBx-4OxbGgzcetuKtBf3eB-6yn3m-EEsQ@mail.gmail.com> <562F75E2.9000505@grosbein.net> <CAJ-VmomocPQ=%2BjKYt8bsLHEWjT1vz=37U_yNB3YMsmxz__5qVw@mail.gmail.com> <CAJ-Vmo=BRP-vyg5=7cyA9v9c_cDjo6Ozv0SLmNj3RZGCKjLYAg@mail.gmail.com> <CAJ-VmokD2vHZ0%2BzO655_csRQw==JUDbaBCDMa%2BU7b1aRv=4BJQ@mail.gmail.com> <5630E844.2080807@grosbein.net> <CAJ-VmonH%2BVfT1zUyAq=fXv6PbwQuiw1_k4CRw1yMgfm6CRaAwA@mail.gmail.com> <CAJ-VmomjWOccaaVyPrGBrf7ACL8KGGOuuFo1QWw02%2BM9smVGFA@mail.gmail.com> <56321ED9.4050602@grosbein.net> <CAJ-Vmom1Tagn6WL-qfNZ7xqPznrLygB6JzMMJdyLU=ROybnEGA@mail.gmail.com> <56323496.609@grosbein.net> <CAJ-Vmond--pm8-rjn55qD8thjneaNiL0eV89Opd1u8p%2BK3BF3w@mail.gmail.com> <56325769.8070202@grosbein.net> <CAJ-Vmo=iD1TdWdPU91TdKL8oW42C_fXUODeigTj55_xJF86AvA@mail.gmail.com>

On 30.10.2015 01:41, Adrian Chadd wrote:
> AH, ok. So it says TX_UNDERRUN + TX_PKT_SENT. So hm.
> 
> The way this is supposed to work is .. odd.
> 
> You queue TX packets to the hardware. The hardware increments
> TXPKTCOUNT in the TX DMA status register.
> 
> Then for each packet you see transmitted, you write TXPKTSENT to the
> TX DMA status register and that decrements TXPKTCOUNT. Once it's zero,
> you won't see any more TX interrupts.
> 
> Now, the 'arge_tx_cnt' value tracks that; it should be zero if it's
> idle. It's 126, which means there are still things to process. there's
> 128 ring slots, so that prod/cons value indicates there's 126 things
> in the ring. arge_tx_locked() does the decrementing and poking
> TXPKTSENT.
> 
> So, I bet the driver and hardware is out of sync. I bet that the (ctrl
> & ARGE_DESC_EMPTY) check is triggering on the current frame in that
> ring. I don't know if it's because prod/cons are out of whack, or it's
> currently trying to check a descriptor in a multi-frame TX descriptor,
> or whether the hardware is just plain buggy and it didn't update that.
> But, that's actually what's going on.
> 
> So that's my 5 minute analysis of it. I wish I could reproduce it on
> what I have here because then I could see what the state of the ring
> is and whether the hardware is buggy or our tx prod/cons tracking is
> busted. I'd really appreciate help here :(
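
To make the accounting described above concrete, here is a rough model of
it in plain C. It is only a sketch and not the if_arge.c code: the struct,
the function names, and the DESC_EMPTY bit value are all made up for
illustration. I also assume that the hardware marks a descriptor EMPTY when
it has finished with it and bumps TXPKTCOUNT at that moment, and that each
reclaimed descriptor corresponds to one TXPKTSENT write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE  128          /* ring slots, per the quoted text */
#define DESC_EMPTY 0x80000000u  /* illustrative bit, not the real ARGE_DESC_EMPTY value */

/* Hypothetical model of the TX ring state; not the driver's softc. */
struct tx_model {
	uint32_t ctrl[RING_SIZE]; /* per-descriptor control word */
	unsigned prod;            /* next slot the driver fills */
	unsigned cons;            /* next slot the driver reclaims */
	unsigned hw_next;         /* next slot the modeled hardware sends */
	unsigned tx_cnt;          /* driver's view: slots still outstanding */
	unsigned hw_pktcount;     /* models TXPKTCOUNT */
};

void tx_init(struct tx_model *m)
{
	for (unsigned i = 0; i < RING_SIZE; i++)
		m->ctrl[i] = DESC_EMPTY; /* idle slots start out marked empty */
	m->prod = m->cons = m->hw_next = 0;
	m->tx_cnt = 0;
	m->hw_pktcount = 0;
}

/* Driver side: hand one descriptor to the hardware. */
bool tx_enqueue(struct tx_model *m)
{
	if (m->tx_cnt == RING_SIZE - 1) /* keep one slot free so prod == cons means idle */
		return false;
	m->ctrl[m->prod] &= ~DESC_EMPTY;
	m->prod = (m->prod + 1) % RING_SIZE;
	m->tx_cnt++;
	return true;
}

/* Hardware side (assumed behaviour): finish one packet. */
bool hw_send_one(struct tx_model *m)
{
	if (m->hw_next == m->prod)
		return false;
	m->ctrl[m->hw_next] |= DESC_EMPTY;
	m->hw_next = (m->hw_next + 1) % RING_SIZE;
	m->hw_pktcount++;
	return true;
}

/* Driver side: reclaim finished descriptors, acking one packet for each. */
unsigned tx_reclaim(struct tx_model *m)
{
	unsigned done = 0;

	while (m->cons != m->prod) {
		if ((m->ctrl[m->cons] & DESC_EMPTY) == 0)
			break; /* descriptor not finished: stop here */
		assert(m->tx_cnt > 0);
		if (m->hw_pktcount > 0)
			m->hw_pktcount--; /* models writing TXPKTSENT */
		m->cons = (m->cons + 1) % RING_SIZE;
		m->tx_cnt--;
		done++;
	}
	return done;
}

/* The TX interrupt stays asserted while the modeled TXPKTCOUNT is non-zero. */
bool tx_irq_pending(const struct tx_model *m)
{
	return m->hw_pktcount > 0;
}
```

In this model a storm corresponds to tx_irq_pending() staying true while
tx_reclaim() returns 0, that is, the hardware has counted a packet but the
descriptor at cons does not look finished to the driver.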

I do not think the hardware is buggy: several versions of the "official"
TP-Link firmware work just fine on this particular device, and earlier
revisions of FreeBSD 11 also work without interrupt storms.

In fact, I bisected and found the offending commit. It took 12 iterations,
but here it is: a kernel built from head at r289897 runs my test without a
storm (although forwarding speed is pretty bad), while a kernel built from
r289898 or any later revision has the interrupt storm.

This is the change:
https://svnweb.freebsd.org/base/head/sys/mips/atheros/if_arge.c?r1=289744&r2=289898&view=patch

My device has hw.model: Atheros AR9344 rev 2




