Date:      Tue, 1 May 2012 19:21:21 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        net@freebsd.org
Cc:        jfv@freebsd.org, Jack Vogel <jfvogel@gmail.com>, John Baldwin <jhb@freebsd.org>
Subject:   Re: 82574L hangs (with r233708 e1000 driver).
Message-ID:  <20120501162121.GV2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <20120412183849.GA2358@deviant.kiev.zoral.com.ua>
References:  <20120407133715.GU2358@deviant.kiev.zoral.com.ua> <CAFOYbc=hFg_jvohPVQrp4M%2BXQztoO6b-9Pop=PrVn6VxP6oaHQ@mail.gmail.com> <20120408051125.GA2358@deviant.kiev.zoral.com.ua> <201204091219.39580.jhb@freebsd.org> <20120412183849.GA2358@deviant.kiev.zoral.com.ua>

On Thu, Apr 12, 2012 at 09:38:49PM +0300, Konstantin Belousov wrote:
> On Mon, Apr 09, 2012 at 12:19:39PM -0400, John Baldwin wrote:
> > On Sunday, April 08, 2012 1:11:25 am Konstantin Belousov wrote:
> > > On Sat, Apr 07, 2012 at 04:22:07PM -0700, Jack Vogel wrote:
> > > > Make sure you have any firmware up to the latest available, if that
> > > > doesn't help let me know and I'll check internally to see if there
> > > > are any outstanding issues in shared code, that will be after the
> > > > weekend.
> > >
> > > I had BIOS rev. 151, after your hint I found rev. 154 on the site.
> > > Now BIOS reports itself as MTCDT10N.86A.0154.2012.0323.1601,
> > > March 23.
> > >
> > > Unfortunately, the upgrade did not change anything with regard to the
> > > hanging interface.
> >
> > Does reverting 233708 make any difference?  Have you tried futzing around
> > with kgdb when it is hung to see what state the device is in (software
> > state at least)?
> It does, in the sense that without r233708 the interface becomes stuck
> almost immediately. I just upgraded to the e1000@r234154, which does not
> change much.
>
> I fiddled with the adapter state after the hang in kgdb more, and I
> noted something interesting. Apparently, tx works. When I ping the remote
> host from my suffering atom machine, the remote host sees the packet. Also
> the remote machine sees some udp traffic originating from the atom, like
> ntp queries.
>
> And, on receive, the atom board does receive interrupts, em0:rx 0 counter
> in vmstat -i increases. Even more fun, the sysctl dev.em.0.debug
> shows increasing hw rdh (as I understand, this is hardware 'last
> received' packet pointer for rx ring). So I looked at the packet
> descriptor at hw rdt index, and there I see
> (kgdb) p/x ((struct adapter *)0xffffff80010e4000)->rx_rings->rx_base[78]
> $11 = {buffer_addr = 0x12a128800, length = 0x5ea, csum = 0x3c2b, status = 0x0,
>   errors = 0x0, special = 0x0}
>
> Apparently, the Descriptor Done bit is clear, so the em_rxeof() function
> breaks from the loop, not consuming the current packet. Also, it returns
> false due to the DD bit being clear. This prevents em_msix_rx() from
> scheduling the taskqueue for processing. So the apparent cause of the hang
> is the missing DD bit in the descriptor.
>
> I am not sure whether all this is obvious to anybody who knows the em
> internals, or where to go from there.
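
To make the stall described in the quoted text concrete, here is a minimal
C model of a receive loop that stops at the first descriptor whose DD bit is
clear. It is only a sketch and not the driver code: struct rx_desc_model,
rx_consume_model() and stalled_ring_demo() are hypothetical names, and the
descriptor is reduced to its status byte.

```c
#include <stdint.h>

#define RXD_STAT_DD 0x01	/* Descriptor Done, same value as E1000_RXD_STAT_DD */

/* Hypothetical, trimmed-down descriptor: only the status byte matters here. */
struct rx_desc_model {
	uint8_t status;
};

/*
 * Consume completed descriptors starting at *next_to_check and stop at the
 * first one whose DD bit is clear. Returns the number of slots consumed.
 */
int
rx_consume_model(struct rx_desc_model *ring, int ndesc, int *next_to_check)
{
	int processed = 0;

	while (processed < ndesc) {
		struct rx_desc_model *cur = &ring[*next_to_check];

		if ((cur->status & RXD_STAT_DD) == 0)
			break;		/* same kind of exit as in em_rxeof() */
		cur->status = 0;	/* hand the slot back to the hardware */
		*next_to_check = (*next_to_check + 1) % ndesc;
		processed++;
	}
	return (processed);
}

/* Slot 0 lost its status write, slots 1 and 2 were written back normally. */
int
stalled_ring_demo(void)
{
	struct rx_desc_model ring[4] = {
		{ 0 }, { RXD_STAT_DD }, { RXD_STAT_DD }, { 0 }
	};
	int next_to_check = 0;

	return (rx_consume_model(ring, 4, &next_to_check));
}
```

In this model a single missed status write at the current slot is enough to
keep the loop from ever reaching the completed descriptors behind it, which
matches the symptom seen in kgdb.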

Ok, nobody cares.

Below is the workaround I use to prevent the interface from wedging.
It seems that a single PCI register read (namely, the rx ring head read)
and a subsequent recheck of the descriptor status greatly reduce the
likelihood of the issue. Unfortunately, the read does not eliminate
the hang completely. So it is not some PCIe coherency problem.
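
The decision the workaround makes for one descriptor can be sketched as a
pure function, which may be easier to follow than the diff. This is an
illustration under assumptions, not the driver code: rx_decide() and
enum rx_action are hypothetical names, the ring is modelled as an array of
status bytes, and `head` stands for the value read from the RDH register.

```c
#include <stdint.h>

#define RXD_STAT_DD 0x01	/* Descriptor Done, same value as E1000_RXD_STAT_DD */

enum rx_action {
	RX_PROCESS,	/* DD is set: handle the packet normally */
	RX_WAIT,	/* nothing conclusive: leave the loop and wait */
	RX_DISCARD	/* status write was missed: drop this slot */
};

/*
 * Decide what to do with descriptor `cur` in a ring of `ndesc` slots.
 * `status` holds one status byte per slot; `head` is the hardware head.
 */
enum rx_action
rx_decide(const volatile uint8_t *status, int ndesc, int cur, int head)
{
	int next;

	if (status[cur] & RXD_STAT_DD)
		return (RX_PROCESS);
	next = (cur + 1 == ndesc) ? 0 : cur + 1;
	if (next == head)
		return (RX_WAIT);
	/* The register read took time; the status write may have landed. */
	if (status[cur] & RXD_STAT_DD)
		return (RX_PROCESS);
	/* Give up on `cur` only if the next slot is already marked done. */
	if ((status[next] & RXD_STAT_DD) == 0)
		return (RX_WAIT);
	return (RX_DISCARD);
}
```

In the patch itself the RX_DISCARD case is the one that prints the
"Workaround:" message and jumps to the existing error path, so the affected
packet is counted as an input error rather than delivered.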

With the patch applied, I am able to copy around blu-ray images, while
previously the interface hung within 20-30 seconds of 100Mbit/s traffic.
Sometimes these messages are printed:
em0: Workaround: head 1018 tail 1002 cur 1010
em0: Workaround: head 976 tail 973 cur 974
em0: Workaround: head 950 tail 939 cur 946
em0: Workaround: head 435 tail 419 cur 426
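
One way to read these lines is as distances around the rx ring. The helper
below is a small sketch for that arithmetic; ring_distance() is a
hypothetical name, and the ring size of 1024 used in the note after it is an
assumption, since the message does not state the configured number of rx
descriptors.

```c
/*
 * Number of slots from `from` forward to `to` in a ring of `ndesc` slots.
 * Both indices are assumed to lie in [0, ndesc).
 */
int
ring_distance(int from, int to, int ndesc)
{
	return ((to - from + ndesc) % ndesc);
}
```

For the first line, ring_distance(1010, 1018, 1024) gives 8, so the hardware
head was 8 slots past the descriptor with the missing DD bit, and
ring_distance(1002, 1010, 1024) gives 8 slots between the tail and the
current descriptor. None of the four lines wraps around the end of the ring,
so the assumed size does not change these numbers.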

The machine is still dead due to random memory corruption which I see; in
particular, pmap sometimes reads garbage from PTEs. I have no idea whether
it is related to the em0 rx descriptor missed writes or is a different issue.

diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index 9c31dad..b2e76cc 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -335,6 +335,11 @@ MODULE_DEPEND(em, ether, 1, 1, 1);
 
 static SYSCTL_NODE(_hw, OID_AUTO, em, CTLFLAG_RD, 0, "EM driver parameters");
 
+static int em_rx_workaround;
+TUNABLE_INT("hw.em.rx_workaround", &em_rx_workaround);
+SYSCTL_INT(_hw_em, OID_AUTO, rx_workaround, CTLFLAG_RDTUN, &em_rx_workaround,
+    0, "Activate workaround for missed DD in rx ring");
+
 static int em_tx_int_delay_dflt = EM_TICKS_TO_USECS(EM_TIDV);
 static int em_rx_int_delay_dflt = EM_TICKS_TO_USECS(EM_RDTR);
 TUNABLE_INT("hw.em.tx_int_delay", &em_tx_int_delay_dflt);
@@ -4422,14 +4427,60 @@ em_rxeof(struct rx_ring *rxr, int count, int *done)
 		status = cur->status;
 		mp = sendmp = NULL;
 
-		if ((status & E1000_RXD_STAT_DD) == 0)
+		if ((status & E1000_RXD_STAT_DD) == 0) {
+			/*
+			 * From PCI/PCI-X Family of Gigabit Ethernet
+			 * Controllers Software Developer's Manual,
+			 * rev. 2.5, p. 306.
+			 *
+			 * Reading the descriptor head to determine
+			 * which buffers are finished is not reliable.
+			 */
+			if (em_rx_workaround) {
+				int head, next;
+
+				head = E1000_READ_REG(&adapter->hw,
+				    E1000_RDH(rxr->me));
+				next = i + 1;
+				if (next == adapter->num_rx_desc)
+					next = 0;
+				if (next == head)
+					break;
+				/*
+				 * Re-read the status for the typical
+				 * case of head advanced due to
+				 * received packet.
+				 */
+				status = cur->status;
+				if ((status & E1000_RXD_STAT_DD) != 0)
+					continue;
+				/*
+				 * Be extra-paranoid and only activate
+				 * the workaround if the next
+				 * descriptor in the rx ring has the
+				 * Descriptor Done bit set, which
+				 * clearly indicates missed write to
+				 * the current descriptor status.
+				 */
+				if ((rxr->rx_base[next].status &
+				    E1000_RXD_STAT_DD) == 0)
+					break;
+				eop = 1; /* XXX */
+				device_printf(adapter->dev,
+				    "Workaround: head %d tail %d cur %d\n",
+				    head, E1000_READ_REG(&adapter->hw,
+				    E1000_RDT(rxr->me)), i);
+				goto boom;
+			}
 			break;
+		}
 
 		len = le16toh(cur->length);
 		eop = (status & E1000_RXD_STAT_EOP) != 0;
 
 		if ((cur->errors & E1000_RXD_ERR_FRAME_ERR_MASK) ||
 		    (rxr->discard == TRUE)) {
+boom:
 			ifp->if_ierrors++;
 			++rxr->rx_discarded;
 			if (!eop) /* Catch subsequent segs */
