Date: Tue, 17 Dec 2013 07:43:49 -0800
From: Adrian Chadd <adrian@freebsd.org>
To: Ryan Stone <rysto32@gmail.com>, Jack F Vogel <jfv@freebsd.org>
Cc: freebsd-net <freebsd-net@freebsd.org>
Subject: Re: buf_ring in HEAD is racy
Message-ID: <CAJ-Vmo=iayVJiMNihmPf_Dd18uEZ5W-j9E3g4SeaRhfFJQj4Ow@mail.gmail.com>
In-Reply-To: <CAJ-Vmo=OpWZ_OphA=+XbmgHw+FwuCMwngHq9_2WOnNHMa_RA9A@mail.gmail.com>
References: <CAFMmRNyJpvZ0AewWr62w16=qKer+FNXUJJy0Qc=EBqMnUV3OyQ@mail.gmail.com>
 <CAJ-VmonJG-M_f_m36f-z3ArumBKNdt5+6muwFJzWABRxRQWJaw@mail.gmail.com>
 <CAJ-Vmo=OpWZ_OphA=+XbmgHw+FwuCMwngHq9_2WOnNHMa_RA9A@mail.gmail.com>
ie, something like this:

Index: sys/dev/ixgbe/ixgbe.c
===================================================================
--- sys/dev/ixgbe/ixgbe.c	(revision 2995)
+++ sys/dev/ixgbe/ixgbe.c	(working copy)
@@ -5178,6 +5178,7 @@
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 missed_rx = 0, bprc, lxon, lxoff, total;
 	u64 total_missed_rx = 0;
+	u64 odrops = 0;
 
 	adapter->stats.crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
 	adapter->stats.illerrc += IXGBE_READ_REG(hw, IXGBE_ILLERRC);
@@ -5308,6 +5309,11 @@
 		adapter->stats.fcoedwtc += IXGBE_READ_REG(hw, IXGBE_FCOEDWTC);
 	}
 
+	/* TX drops */
+	for (int i = 0; i < adapter->num_queues; i++) {
+		odrops += adapter->tx_rings[i].br->br_drops;
+	}
+
 	/* Fill out the OS statistics structure */
 	ifp->if_ipackets = adapter->stats.gprc;
 	ifp->if_opackets = adapter->stats.gptc;
@@ -5317,6 +5323,9 @@
 	ifp->if_omcasts = adapter->stats.mptc;
 	ifp->if_collisions = 0;
 
+	/* TX drops are stored in if_snd for now, not the top level counters */
+	ifp->if_snd.ifq_drops = odrops;
+
 	/* Rx Errors */
 	ifp->if_iqdrops = total_missed_rx;
 	ifp->if_ierrors = adapter->stats.crcerrs + adapter->stats.rlec;


-adrian


On 16 December 2013 12:19, Adrian Chadd <adrian@freebsd.org> wrote:
> So I've done some digging.
>
> The TL;DR - we really should expose the drops count in the ifa_data
> field. But that would involve an ABI change and .. well, that will be
> full of drama.
>
> The non-TL;DR:
>
> * The send queue drops are in the ifmibdata structure, which includes
> a few other things. But that requires you use the MIB interface to
> pull things out, rather than getifaddrs().
> * getifaddrs() doesn't contain the max sendq length or drops, so we
> can't report those without adding a MIB fetch for the relevant
> interface.
> * bsnmp (which we use at work) correctly populates output discards by
> checking the MIB for ifmd_snd_drops.
>
> So I'd rather we fix ixgbe and other drivers by updating the send
> queue drops counter if the frames are dropped. That way it's
> consistent with other things.
>
> We should then do some of these things:
>
> * add a send queue drops field to the ifdata/ifadata where appropriate
> and make sure it gets correctly populated;
> * teach netstat to use ifmib instead of getifaddrs();
> * as part of killing the if_snd queue stuff, we should also migrate
> that particular drops counter out from there and to a top-level
> counter in ifnet, like the rest.
>
> Comments?
>
>
> -a
>
>
> On 14 December 2013 16:06, Adrian Chadd <adrian@freebsd.org> wrote:
>> oh cool, you just did the output-drops thing I was about to code up.
>> We're missing those counters at work and the ops guys poked me about
>> it.
>>
>> I'll also give that a whirl locally and see about working with Jack to
>> get it into -HEAD / MFC'ed to 10.
>>
>> Thanks!
>>
>>
>> -a
>>
>>
>> On 13 December 2013 21:04, Ryan Stone <rysto32@gmail.com> wrote:
>>> I am seeing spurious output packet drops that appear to be due to
>>> insufficient memory barriers in buf_ring. I believe that this is the
>>> scenario that I am seeing:
>>>
>>> 1) The buf_ring is empty, br_prod_head = br_cons_head = 0.
>>> 2) Thread 1 attempts to enqueue an mbuf on the buf_ring. It fetches
>>> br_prod_head (0) into a local variable called prod_head.
>>> 3) Thread 2 enqueues an mbuf on the buf_ring. The sequence of events
>>> is essentially:
>>>
>>> Thread 2 claims an index in the ring and atomically sets br_prod_head
>>> (say to 1)
>>> Thread 2 sets br_ring[1] = mbuf;
>>> Thread 2 does a full memory barrier
>>> Thread 2 updates br_prod_tail to 1
>>>
>>> 4) Thread 2 dequeues the packet from the buf_ring using the
>>> single-consumer interface. The sequence of events is essentially:
>>>
>>> Thread 2 checks whether the queue is empty (br_cons_head ==
>>> br_prod_tail); this is false
>>> Thread 2 sets br_cons_head to 1
>>> Thread 2 grabs the mbuf from br_ring[1]
>>> Thread 2 sets br_cons_tail to 1
>>>
>>> 5) Thread 1, which is still attempting to enqueue an mbuf on the ring,
>>> fetches br_cons_tail (1) into a local variable called cons_tail. It
>>> sees cons_tail == 1 but prod_head == 0 and concludes that the ring is
>>> full and drops the packet (incrementing br_drops non-atomically, I
>>> might add).
>>>
>>> I can reproduce several drops per minute by configuring the ixgbe
>>> driver to use only 1 queue and then sending traffic from 8 concurrent
>>> iperf processes. (You will need this hacky patch to even see the
>>> drops with netstat, though:
>>> http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)
>>>
>>> I am investigating fixing buf_ring by using acquire/release semantics
>>> rather than load/store barriers. However, I note that this will
>>> apparently be the second attempt to fix buf_ring, and I'm seriously
>>> questioning whether this is worth the effort compared to the
>>> simplicity of using a mutex. I'm not convinced that a correct
>>> lockless implementation will even be a performance win, given the
>>> number of memory barriers that will apparently be necessary.
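
For anyone who wants to stare at the window Ryan describes, here's a
condensed paraphrase of the multi-producer enqueue path (modelled on
sys/sys/buf_ring.h as it stands; it is not the verbatim source, and the
struct definition and DEBUG_BUFRING bits are omitted). The two plain
loads of br_prod_head and br_cons_tail aren't ordered with respect to
the other producer or the consumer, so the "full" test can fire on an
empty ring:

/*
 * Condensed paraphrase of buf_ring_enqueue(); illustration only, not
 * the verbatim sys/sys/buf_ring.h code.
 */
static __inline int
buf_ring_enqueue_sketch(struct buf_ring *br, void *buf)
{
	uint32_t prod_head, prod_next, cons_tail;

	critical_enter();
	do {
		prod_head = br->br_prod_head;	/* step 2: thread 1 reads 0 */
		prod_next = (prod_head + 1) & br->br_prod_mask;
		cons_tail = br->br_cons_tail;	/* step 5: thread 1 reads 1 */

		if (prod_next == cons_tail) {
			/*
			 * With the interleaving above, prod_head is stale:
			 * another producer and the consumer both ran between
			 * the two loads, so an empty ring looks full and the
			 * packet is dropped.  Note the plain, non-atomic
			 * increment of br_drops while we're here.
			 */
			br->br_drops++;
			critical_exit();
			return (ENOBUFS);
		}
	} while (!atomic_cmpset_int(&br->br_prod_head, prod_head, prod_next));

	br->br_ring[prod_head] = buf;
	wmb();

	/* Wait for any earlier enqueues to publish, then publish ours. */
	while (br->br_prod_tail != prod_head)
		cpu_spinwait();
	br->br_prod_tail = prod_next;
	critical_exit();
	return (0);
}

One lock-free way out would be to treat the "looks full" case as
provisional: re-read br_prod_head and br_cons_tail (with a read barrier
between the loads) and only count a drop if neither has moved, otherwise
retry the loop. Whether that, plus the acquire/release conversion Ryan
mentions, still beats a plain mutex on this path is exactly the question
he's raising.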
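
On the counter side: the ifmd_snd_drops field mentioned above is filled
from ifp->if_snd.ifq_drops, which is exactly what the patch at the top
feeds, and it's readable from userland through the net.link.generic MIB.
A rough, untested sketch along the lines of the ifmib(4) example (error
handling trimmed; interface name taken on the command line):

/*
 * Rough sketch: pull struct ifmibdata for one interface via
 * net.link.generic.ifdata.<ifindex>.general and print the send queue
 * counters that getifaddrs()-based tools don't currently report.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>

#include <net/if.h>
#include <net/if_mib.h>

#include <stdio.h>

int
main(int argc, char **argv)
{
	struct ifmibdata ifmd;
	size_t len = sizeof(ifmd);
	int name[6];
	unsigned int ifindex;

	if (argc != 2 || (ifindex = if_nametoindex(argv[1])) == 0) {
		fprintf(stderr, "usage: %s <ifname>\n", argv[0]);
		return (1);
	}

	name[0] = CTL_NET;
	name[1] = PF_LINK;
	name[2] = NETLINK_GENERIC;
	name[3] = IFMIB_IFDATA;
	name[4] = (int)ifindex;		/* MIB row == interface index */
	name[5] = IFDATA_GENERAL;

	if (sysctl(name, 6, &ifmd, &len, NULL, 0) == -1) {
		perror("sysctl");
		return (1);
	}

	printf("%s: sendq drops %d (len %d, maxlen %d)\n", ifmd.ifmd_name,
	    ifmd.ifmd_snd_drops, ifmd.ifmd_snd_len, ifmd.ifmd_snd_maxlen);
	return (0);
}

That's essentially what bsnmp does for output discards; teaching netstat
the same trick (or moving the counter into the top-level ifnet
statistics, as suggested above) is the follow-up work.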