Date: Mon, 4 Feb 2013 15:24:58 -0500 From: John Baldwin <jhb@freebsd.org> To: freebsd-net@freebsd.org Cc: Robert Watson <rwatson@freebsd.org>, Kip Macy <kmacy@freebsd.org>, Randy Stewart <randall@lakerest.net>, jv@freebsd.org Subject: Re: Driver patch to look at... Message-ID: <201302041524.58699.jhb@freebsd.org> In-Reply-To: <D3AA078A-CD19-4228-A019-BE9C985895E2@lakerest.net> References: <D3AA078A-CD19-4228-A019-BE9C985895E2@lakerest.net>
next in thread | previous in thread | raw e-mail | index | archive | help
On Monday, February 04, 2013 12:22:49 pm Randy Stewart wrote: > All: > > I have been working with TCP in gigabit networks (igb driver actually) and have > found a very nasty problem with the way the driver is doing its put back when > it fills the out-bound transmit queue. > > Basically it has taken a packet from the head of the ring buffer, and then > realizes it can't fit it into the transmit queue. So it just re-enqueue's it > into the ring buffer. Whats wrong with that? Well most of the time there > are anywhere from 10-50 packets (maybe more) in that ring buffer when you are > operating at full speed (or trying to). This means you will see 10 duplicate > ACKs, do a fast retransmit and cut your cwnd in half.. not very nice actually. > > The patch I have attached makes it so that > > 1) There are ways to swap back. > 2) Use the peek in the ring buffer and only > dequeue the packet if we put it into the transmit ring > 3) If something goes wrong and the transmit frees the packet we dequeue it. > 4) If the transmit changed it (defrag etc) then swap out the new mbuf that > has taken its place. > > I have fixed the four intel drivers that had this systemic issue, but there > are still more to fix. > > Comments/review .. rotten egg's etc.. would be most welcome before > I commit this.. Does this only happen in drivers that use bufring? I seem to recall that drivers using IFQ would just stuff the packet at the head of the IFQ via IFQ_DRV_PREPEND() in this case so it is still the next packet to transmit. See, for example, this bit in dc_start_locked(): for (queued = 0; !IFQ_DRV_IS_EMPTY(&ifp->if_snd); ) { /* * If there's no way we can send any packets, return now. */ if (sc->dc_cdata.dc_tx_cnt > DC_TX_LIST_CNT - DC_TX_LIST_RSVD) { ifp->if_drv_flags |= IFF_DRV_OACTIVE; break; } IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head); if (m_head == NULL) break; if (dc_encap(sc, &m_head)) { if (m_head == NULL) break; IFQ_DRV_PREPEND(&ifp->if_snd, m_head); ifp->if_drv_flags |= IFF_DRV_OACTIVE; break; } It sounds like drbr/buf_ring just don't handle this case correctly? It seems a shame to have to duplicate so much code in the various drivers to fix this, but that seems to be par for the course when using buf_ring. :( (buggy in edge cases and lots of duplicated code that is). Also, doing the drbr_swap() just so that drbr_dequeue() returns what you just swapped in seems... odd. It seems that it would be nicer instead to have some sort of drbr_peek() / drbr_advance() where the latter just skips over whatever the current head is? Then you could have something like: while ((next = drbr_peek()) != NULL) { if (!foo_encap(&next)) { if (next == NULL) drbr_advance(); break; } drbr_advance(); } I guess the sticky widget is the case of ATLQ as you need to dequeue the packet always in the ALTQ case and put it back if the encap fails. For your patch it's not clear to me how that works. It seems that if the encap routine frees the mbuf you try to dereference a freed pointer when you call drbr_dequeue(). I really think you will instead need some sort of 'drbr_putback()' and have 'drbr_peek()' dequeue in the ALTQ case and use 'drbr_putback()' to put it back (PREPEND) in the ALTQ case. -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201302041524.58699.jhb>