Date: Sat, 14 Dec 2013 00:04:57 -0500
From: Ryan Stone <rysto32@gmail.com>
To: freebsd-net <freebsd-net@freebsd.org>
Subject: buf_ring in HEAD is racy
Message-ID: <CAFMmRNyJpvZ0AewWr62w16=qKer%2BFNXUJJy0Qc=EBqMnUV3OyQ@mail.gmail.com>

I am seeing spurious output packet drops that appear to be due to
insufficient memory barriers in buf_ring. I believe that this is the
scenario that I am seeing:

1) The buf_ring is empty, br_prod_head = br_cons_head = 0

2) Thread 1 attempts to enqueue an mbuf on the buf_ring. It fetches
   br_prod_head (0) into a local variable called prod_head

3) Thread 2 enqueues an mbuf on the buf_ring. The sequence of events
   is essentially:

   Thread 2 claims an index in the ring and atomically sets
   br_prod_head (say to 1)
   Thread 2 sets br_ring[1] = mbuf;
   Thread 2 does a full memory barrier
   Thread 2 updates br_prod_tail to 1

4) Thread 2 dequeues the packet from the buf_ring using the
   single-consumer interface. The sequence of events is essentially:

   Thread 2 checks whether queue is empty (br_cons_head ==
   br_prod_tail), this is false
   Thread 2 sets br_cons_head to 1
   Thread 2 grabs the mbuf from br_ring[1]
   Thread 2 sets br_cons_tail to 1

5) Thread 1, which is still attempting to enqueue an mbuf on the ring,
   fetches br_cons_tail (1) into a local variable called cons_tail.
   It sees cons_tail == 1 but prod_head == 0 and concludes that the
   ring is full and drops the packet (incrementing br_drops
   non-atomically, I might add)

I can reproduce several drops per minute by configuring the ixgbe
driver to use only 1 queue and then sending traffic from 8 concurrent
iperf processes. (You will need this hacky patch to even see the drops
with netstat, though:
http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)

I am investigating fixing buf_ring by using acquire/release semantics
rather than load/store barriers. However, I note that this will
apparently be the second attempt to fix buf_ring, and I'm seriously
questioning whether this is worth the effort compared to the
simplicity of using a mutex. I'm not even convinced that a correct
lockless implementation will be a performance win, given the number
of memory barriers that will apparently be necessary.
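
The interleaving in steps 1-5 can be replayed deterministically in a
single thread, which may make the stale-snapshot problem easier to see.
The following is only a sketch under stated assumptions: struct
ring_model and every function name in it are hypothetical, it is not
the real struct buf_ring or the buf_ring enqueue code, and the
fullness test ((prod_head + 1) & mask == cons_tail) is assumed from
the description in step 5. It models only the index arithmetic, not
memory barriers or real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the ring indices (not struct buf_ring). */
struct ring_model {
	uint32_t prod_head;
	uint32_t prod_tail;
	uint32_t cons_head;
	uint32_t cons_tail;
	uint32_t mask;		/* ring size - 1; size is a power of two */
};

static void
ring_model_init(struct ring_model *r, uint32_t size)
{
	assert(size >= 2 && (size & (size - 1)) == 0);
	r->prod_head = r->prod_tail = 0;
	r->cons_head = r->cons_tail = 0;
	r->mask = size - 1;
}

/* Assumed producer fullness test, applied to whatever snapshot it holds. */
static bool
looks_full(const struct ring_model *r, uint32_t prod_head, uint32_t cons_tail)
{
	return (((prod_head + 1) & r->mask) == cons_tail);
}

/* One complete enqueue by a single thread (step 3). */
static void
model_enqueue(struct ring_model *r)
{
	r->prod_head = (r->prod_head + 1) & r->mask;	/* claim an index */
	/* the slot store and the barrier would happen here */
	r->prod_tail = r->prod_head;			/* publish */
}

/* One complete single-consumer dequeue (step 4); false if the ring is empty. */
static bool
model_dequeue_sc(struct ring_model *r)
{
	if (r->cons_head == r->prod_tail)
		return (false);
	r->cons_head = (r->cons_head + 1) & r->mask;
	/* the slot load would happen here */
	r->cons_tail = r->cons_head;
	return (true);
}

/* Replays steps 1-5; true means Thread 1 sees "full" on an empty ring. */
static bool
replay_spurious_drop(uint32_t size)
{
	struct ring_model r;

	ring_model_init(&r, size);			/* step 1 */
	uint32_t t1_prod_head = r.prod_head;		/* step 2: stale later */
	model_enqueue(&r);				/* step 3 */
	(void)model_dequeue_sc(&r);			/* step 4 */
	uint32_t t1_cons_tail = r.cons_tail;		/* step 5 */

	bool ring_is_empty = (r.cons_head == r.prod_tail);
	return (ring_is_empty && looks_full(&r, t1_prod_head, t1_cons_tail));
}

/* Same history, but both indices are read together after steps 3 and 4. */
static bool
replay_consistent_snapshot(uint32_t size)
{
	struct ring_model r;

	ring_model_init(&r, size);
	model_enqueue(&r);
	(void)model_dequeue_sc(&r);
	return (looks_full(&r, r.prod_head, r.cons_tail));
}
```

In this model the mixed snapshot (prod_head read before steps 3 and 4,
cons_tail read after) reports a full ring even though the ring is
empty, while a snapshot taken at a single point in time does not. How
the real code should obtain a consistent pair of values (acquire/release
ordering, re-reading the indices, or a mutex) is the open question
raised above and is not answered by this sketch.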