Date: Tue, 17 Dec 2013 17:39:45 +0100
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
To: Adrian Chadd <adrian@freebsd.org>
Cc: Jack F Vogel <jfv@freebsd.org>, freebsd-net <freebsd-net@freebsd.org>, Ryan Stone <rysto32@gmail.com>
Subject: Re: buf_ring in HEAD is racy
Message-ID: <E5612C2C-C9B9-45D8-A794-9993FE39D9E6@lurchi.franken.de>
In-Reply-To: <CAJ-Vmo=iayVJiMNihmPf_Dd18uEZ5W-j9E3g4SeaRhfFJQj4Ow@mail.gmail.com>
References: <CAFMmRNyJpvZ0AewWr62w16=qKer+FNXUJJy0Qc=EBqMnUV3OyQ@mail.gmail.com> <CAJ-VmonJG-M_f_m36f-z3ArumBKNdt5+6muwFJzWABRxRQWJaw@mail.gmail.com> <CAJ-Vmo=OpWZ_OphA=+XbmgHw+FwuCMwngHq9_2WOnNHMa_RA9A@mail.gmail.com> <CAJ-Vmo=iayVJiMNihmPf_Dd18uEZ5W-j9E3g4SeaRhfFJQj4Ow@mail.gmail.com>
On Dec 17, 2013, at 4:43 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> ie:
>
> Index: sys/dev/ixgbe/ixgbe.c
> ===================================================================
> --- sys/dev/ixgbe/ixgbe.c	(revision 2995)
> +++ sys/dev/ixgbe/ixgbe.c	(working copy)
> @@ -5178,6 +5178,7 @@
> 	struct ixgbe_hw *hw = &adapter->hw;
> 	u32 missed_rx = 0, bprc, lxon, lxoff, total;
> 	u64 total_missed_rx = 0;
> +	u64 odrops;

What about

+	u64 odrops = 0;

since odrops seems to be used uninitialized.

Best regards
Michael

>
> 	adapter->stats.crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
> 	adapter->stats.illerrc += IXGBE_READ_REG(hw, IXGBE_ILLERRC);
> @@ -5308,6 +5309,11 @@
> 		adapter->stats.fcoedwtc += IXGBE_READ_REG(hw, IXGBE_FCOEDWTC);
> 	}
>
> +	/* TX drops */
> +	for (int i = 0; i < adapter->num_queues; i++) {
> +		odrops += adapter->tx_rings[i].br->br_drops;
> +	}
> +
> 	/* Fill out the OS statistics structure */
> 	ifp->if_ipackets = adapter->stats.gprc;
> 	ifp->if_opackets = adapter->stats.gptc;
> @@ -5317,6 +5323,9 @@
> 	ifp->if_omcasts = adapter->stats.mptc;
> 	ifp->if_collisions = 0;
>
> +	/* TX drops are stored in if_snd for now, not the top level counters */
> +	ifp->if_snd.ifq_drops = drops;
> +
> 	/* Rx Errors */
> 	ifp->if_iqdrops = total_missed_rx;
> 	ifp->if_ierrors = adapter->stats.crcerrs + adapter->stats.rlec;
>
>
> -adrian
>
> On 16 December 2013 12:19, Adrian Chadd <adrian@freebsd.org> wrote:
>> So I've done some digging.
>>
>> The TL;DR - we really should expose the drops count in the ifa_data
>> field. But that would involve an ABI change and .. well, that will be
>> full of drama.
>>
>> The non-TL;DR:
>>
>> * The send queue drops are in the ifmibdata structure, which includes
>> a few other things.
>> But that requires you use the MIB interface to pull things out,
>> rather than getifaddrs().
>> * getifaddrs() doesn't contain the max sendq length or drops, so we
>> can't report those without adding a MIB fetch for the relevant
>> interface.
>> * bsnmp (which we use at work) correctly populates output discards by
>> checking the MIB for ifmd_snd_drops.
>>
>> So I'd rather we fix ixgbe and other drivers by updating the send
>> queue drops counter if the frames are dropped. That way it's
>> consistent with other things.
>>
>> We should then do some of these things:
>>
>> * add a send queue drops field to the ifdata/ifadata where appropriate
>> and make sure it gets correctly populated;
>> * teach netstat to use ifmib instead of getifaddrs();
>> * as part of killing the ifsnd queue stuff, we should also migrate
>> that particular drops counter out from there and to a top-level
>> counter in ifnet, like the rest.
>>
>> Comments?
>>
>>
>> -a
>>
>>
>> On 14 December 2013 16:06, Adrian Chadd <adrian@freebsd.org> wrote:
>>> oh cool, you just did the output-drops thing I was about to code up.
>>> We're missing those counters at work and the ops guys poked me about
>>> it.
>>>
>>> I'll also give that a whirl locally and see about working with jack to
>>> get it into -HEAD / MFC'ed to 10.
>>>
>>> Thanks!
>>>
>>>
>>> -a
>>>
>>>
>>> On 13 December 2013 21:04, Ryan Stone <rysto32@gmail.com> wrote:
>>>> I am seeing spurious output packet drops that appear to be due to
>>>> insufficient memory barriers in buf_ring. I believe that this is the
>>>> scenario that I am seeing:
>>>>
>>>> 1) The buf_ring is empty, br_prod_head = br_cons_head = 0
>>>> 2) Thread 1 attempts to enqueue an mbuf on the buf_ring. It fetches
>>>> br_prod_head (0) into a local variable called prod_head
>>>> 3) Thread 2 enqueues an mbuf on the buf_ring.
>>>> The sequence of events is essentially:
>>>>
>>>> Thread 2 claims an index in the ring and atomically sets br_prod_head (say to 1)
>>>> Thread 2 sets br_ring[1] = mbuf;
>>>> Thread 2 does a full memory barrier
>>>> Thread 2 updates br_prod_tail to 1
>>>>
>>>> 4) Thread 2 dequeues the packet from the buf_ring using the
>>>> single-consumer interface. The sequence of events is essentially:
>>>>
>>>> Thread 2 checks whether the queue is empty (br_cons_head == br_prod_tail);
>>>> this is false
>>>> Thread 2 sets br_cons_head to 1
>>>> Thread 2 grabs the mbuf from br_ring[1]
>>>> Thread 2 sets br_cons_tail to 1
>>>>
>>>> 5) Thread 1, which is still attempting to enqueue an mbuf on the ring,
>>>> fetches br_cons_tail (1) into a local variable called cons_tail. It
>>>> sees cons_tail == 1 but prod_head == 0 and concludes that the ring is
>>>> full and drops the packet (incrementing br_drops unatomically, I might
>>>> add).
>>>>
>>>> I can reproduce several drops per minute by configuring the ixgbe
>>>> driver to use only 1 queue and then sending traffic from 8 concurrent
>>>> iperf processes. (You will need this hacky patch to even see the
>>>> drops with netstat, though:
>>>> http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)
>>>>
>>>> I am investigating fixing buf_ring by using acquire/release semantics
>>>> rather than load/store barriers. However, I note that this will
>>>> apparently be the second attempt to fix buf_ring, and I'm seriously
>>>> questioning whether this is worth the effort compared to the
>>>> simplicity of using a mutex. I'm not even convinced that a correct
>>>> lockless implementation will be a performance win, given the
>>>> number of memory barriers that will apparently be necessary.
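[Editor's note: the interleaving Ryan describes, and the acquire/release direction he mentions investigating, can be sketched with C11 atomics. This is a hypothetical standalone toy ring, not FreeBSD's sys/buf_ring.h; the names (ring_enqueue, prod_head, cons_tail, drops) mirror the discussion, but the code illustrates the technique rather than any actual proposed patch.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4u                     /* must be a power of two */

struct ring {
	_Atomic uint32_t prod_head;      /* next index a producer will claim */
	_Atomic uint32_t prod_tail;      /* indices published to the consumer */
	_Atomic uint32_t cons_head;      /* next index the consumer will take */
	_Atomic uint32_t cons_tail;      /* indices fully consumed */
	_Atomic uint64_t drops;          /* updated atomically, unlike br_drops */
	void *slot[RING_SIZE];
};

/* Multi-producer enqueue. */
static bool ring_enqueue(struct ring *r, void *m)
{
	uint32_t head, tail;

	for (;;) {
		head = atomic_load_explicit(&r->prod_head, memory_order_acquire);
		tail = atomic_load_explicit(&r->cons_tail, memory_order_acquire);
		if (head - tail >= RING_SIZE) {
			/* Apparently full.  If prod_head moved since we read
			 * it, our snapshot was stale (a fresh cons_tail paired
			 * with a stale prod_head, as in step 5 of the race),
			 * so retry instead of dropping. */
			if (head != atomic_load_explicit(&r->prod_head,
			    memory_order_acquire))
				continue;
			atomic_fetch_add_explicit(&r->drops, 1,
			    memory_order_relaxed);
			return false;
		}
		if (atomic_compare_exchange_weak_explicit(&r->prod_head,
		    &head, head + 1,
		    memory_order_acq_rel, memory_order_acquire))
			break;           /* index head is now ours */
	}

	r->slot[head % RING_SIZE] = m;

	/* Wait for earlier producers to publish, then publish our slot with
	 * release semantics so the consumer sees the slot write first. */
	while (atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire) != head)
		;
	atomic_store_explicit(&r->prod_tail, head + 1, memory_order_release);
	return true;
}

/* Single-consumer dequeue. */
static void *ring_dequeue(struct ring *r)
{
	uint32_t head = atomic_load_explicit(&r->cons_head,
	    memory_order_relaxed);

	/* Acquire pairs with the producer's release store of prod_tail. */
	if (head == atomic_load_explicit(&r->prod_tail, memory_order_acquire))
		return NULL;             /* empty */
	void *m = r->slot[head % RING_SIZE];
	atomic_store_explicit(&r->cons_head, head + 1, memory_order_relaxed);
	/* Release so a producer reusing this slot sees our read completed. */
	atomic_store_explicit(&r->cons_tail, head + 1, memory_order_release);
	return m;
}
```

The spurious-full drop in step 5 comes from comparing a stale prod_head against a fresh cons_tail; re-reading prod_head before counting a drop closes that window, and the relaxed fetch_add addresses the unatomic br_drops increment noted above. The number of ordered operations this takes is exactly the cost Ryan is weighing against a mutex.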
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
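[Editor's note: Ryan's closing point, that a mutex might be simpler and not obviously slower, is easy to see side by side. Below is a hypothetical mutex-protected equivalent of the same fixed-size ring; it is an illustration using POSIX threads, not FreeBSD's actual drbr/IFQ code.]

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MQ_SIZE 4u                       /* fixed ring capacity */

struct mtx_ring {
	pthread_mutex_t lock;
	uint32_t head, tail;             /* free-running: tail <= head <= tail + MQ_SIZE */
	uint64_t drops;
	void *slot[MQ_SIZE];
};

static bool mq_enqueue(struct mtx_ring *q, void *m)
{
	bool ok = true;

	pthread_mutex_lock(&q->lock);
	if (q->head - q->tail >= MQ_SIZE) {
		q->drops++;              /* no atomics needed under the lock */
		ok = false;
	} else {
		q->slot[q->head++ % MQ_SIZE] = m;
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}

static void *mq_dequeue(struct mtx_ring *q)
{
	void *m = NULL;

	pthread_mutex_lock(&q->lock);
	if (q->tail != q->head)
		m = q->slot[q->tail++ % MQ_SIZE];
	pthread_mutex_unlock(&q->lock);
	return m;
}
```

There is no ordering to reason about here: the lock serializes every access, the full/empty checks can never see a torn snapshot, and the drop counter is trivially correct. Whether the lock-free version actually wins then depends on contention and on how many barriers a correct implementation ends up needing.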