From owner-svn-src-stable-9@FreeBSD.ORG Mon Nov 25 20:01:35 2013 Return-Path: Delivered-To: svn-src-stable-9@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 77514724; Mon, 25 Nov 2013 20:01:35 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 65E492590; Mon, 25 Nov 2013 20:01:35 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id rAPK1ZuT014006; Mon, 25 Nov 2013 20:01:35 GMT (envelope-from delphij@svn.freebsd.org) Received: (from delphij@localhost) by svn.freebsd.org (8.14.7/8.14.5/Submit) id rAPK1YXX014000; Mon, 25 Nov 2013 20:01:34 GMT (envelope-from delphij@svn.freebsd.org) Message-Id: <201311252001.rAPK1YXX014000@svn.freebsd.org> From: Xin LI Date: Mon, 25 Nov 2013 20:01:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org Subject: svn commit: r258584 - in stable/9/sys: dev/bxe dev/oce ofed/drivers/net/mlx4 X-SVN-Group: stable-9 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-9@freebsd.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: SVN commit messages for only the 9-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Nov 2013 20:01:35 -0000 Author: delphij Date: Mon Nov 25 20:01:34 2013 New Revision: 258584 URL: http://svnweb.freebsd.org/changeset/base/258584 Log: MFC r246482 (rrs) + r246581: This fixes a out-of-order problem with several of the newer drivers. The basic problem was that the driver was pulling the mbuf off the drbr ring and then when sending with xmit(), encounting a full transmit ring. Thus the lower layer xmit() function would return an error, and the drivers would then append the data back on to the ring. For TCP this is a horrible scenario sure to bring on a fast-retransmit. The fix is to use drbr_peek() to pull the data pointer but not remove it from the ring. If it fails then we either call the new drbr_putback or drbr_advance method. Advance moves it forward (we do this sometimes when the xmit() function frees the mbuf). When we succeed we always call advance. The putback will always copy the mbuf back to the top of the ring. Note that the putback *cannot* be used with a drbr_dequeue() only with drbr_peek(). We most of the time, in putback, would not need to copy it back since most likey the mbuf is still the same, but sometimes xmit() functions will change the mbuf via a pullup or other call. So the optimial case for the single consumer is to always copy it back. If we ever do a multiple_consumer (for lagg?) we will need a test and atomic in the put back possibly a seperate putback_mc() in the ring buf. Reviewed by: jhb@freebsd.org, jlv@freebsd.org Modified: stable/9/sys/dev/bxe/if_bxe.c stable/9/sys/dev/oce/oce_if.c stable/9/sys/ofed/drivers/net/mlx4/en_tx.c Directory Properties: stable/9/sys/ (props changed) stable/9/sys/dev/ (props changed) Modified: stable/9/sys/dev/bxe/if_bxe.c ============================================================================== --- stable/9/sys/dev/bxe/if_bxe.c Mon Nov 25 19:53:46 2013 (r258583) +++ stable/9/sys/dev/bxe/if_bxe.c Mon Nov 25 20:01:34 2013 (r258584) @@ -9501,24 +9501,15 @@ bxe_tx_mq_start_locked(struct ifnet *ifp BXE_FP_LOCK_ASSERT(fp); - if (m == NULL) { - /* No new work, check for pending frames. */ - next = drbr_dequeue(ifp, fp->br); - } else if (drbr_needs_enqueue(ifp, fp->br)) { - /* Both new and pending work, maintain packet order. */ + if (m != NULL) { rc = drbr_enqueue(ifp, fp->br, m); if (rc != 0) { fp->tx_soft_errors++; goto bxe_tx_mq_start_locked_exit; } - next = drbr_dequeue(ifp, fp->br); - } else - /* New work only, nothing pending. */ - next = m; - + } /* Keep adding entries while there are frames to send. */ - while (next != NULL) { - + while ((next = drbr_peek(ifp, fp->br)) != NULL) { /* The transmit mbuf now belongs to us, keep track of it. */ fp->tx_mbuf_alloc++; @@ -9532,23 +9523,22 @@ bxe_tx_mq_start_locked(struct ifnet *ifp if (__predict_false(rc != 0)) { fp->tx_encap_failures++; /* Very Bad Frames(tm) may have been dropped. */ - if (next != NULL) { + if (next == NULL) { + drbr_advance(ifp, fp->br); + } else { + drbr_putback(ifp, fp->br, next); /* * Mark the TX queue as full and save * the frame. */ ifp->if_drv_flags |= IFF_DRV_OACTIVE; fp->tx_frame_deferred++; - - /* This may reorder frame. */ - rc = drbr_enqueue(ifp, fp->br, next); fp->tx_mbuf_alloc--; } - /* Stop looking for more work. */ break; } - + drbr_advance(ifp, fp->br); /* The transmit frame was enqueued successfully. */ tx_count++; @@ -9569,8 +9559,6 @@ bxe_tx_mq_start_locked(struct ifnet *ifp ifp->if_drv_flags &= ~IFF_DRV_OACTIVE; break; } - - next = drbr_dequeue(ifp, fp->br); } /* No TX packets were dequeued. */ Modified: stable/9/sys/dev/oce/oce_if.c ============================================================================== --- stable/9/sys/dev/oce/oce_if.c Mon Nov 25 19:53:46 2013 (r258583) +++ stable/9/sys/dev/oce/oce_if.c Mon Nov 25 20:01:34 2013 (r258584) @@ -1173,29 +1173,27 @@ oce_multiq_transmit(struct ifnet *ifp, s return status; } - if (m == NULL) - next = drbr_dequeue(ifp, br); - else if (drbr_needs_enqueue(ifp, br)) { + if (m != NULL) { if ((status = drbr_enqueue(ifp, br, m)) != 0) return status; - next = drbr_dequeue(ifp, br); - } else - next = m; - - while (next != NULL) { + } + while ((next = drbr_peek(ifp, br)) != NULL) { if (oce_tx(sc, &next, queue_index)) { - if (next != NULL) { + if (next == NULL) { + drbr_advance(ifp, br); + } else { + drbr_putback(ifp, br, next); wq->tx_stats.tx_stops ++; ifp->if_drv_flags |= IFF_DRV_OACTIVE; status = drbr_enqueue(ifp, br, next); } break; } + drbr_advance(ifp, br); ifp->if_obytes += next->m_pkthdr.len; if (next->m_flags & M_MCAST) ifp->if_omcasts++; ETHER_BPF_MTAP(ifp, next); - next = drbr_dequeue(ifp, br); } return status; Modified: stable/9/sys/ofed/drivers/net/mlx4/en_tx.c ============================================================================== --- stable/9/sys/ofed/drivers/net/mlx4/en_tx.c Mon Nov 25 19:53:46 2013 (r258583) +++ stable/9/sys/ofed/drivers/net/mlx4/en_tx.c Mon Nov 25 20:01:34 2013 (r258584) @@ -935,22 +935,21 @@ mlx4_en_transmit_locked(struct ifnet *de } enqueued = 0; - if (m == NULL) { - next = drbr_dequeue(dev, ring->br); - } else if (drbr_needs_enqueue(dev, ring->br)) { + if (m != NULL) { if ((err = drbr_enqueue(dev, ring->br, m)) != 0) return (err); - next = drbr_dequeue(dev, ring->br); - } else - next = m; - + } /* Process the queue */ - while (next != NULL) { + while ((next = drbr_peek(dev, ring->br)) != NULL) { if ((err = mlx4_en_xmit(dev, tx_ind, &next)) != 0) { - if (next != NULL) - err = drbr_enqueue(dev, ring->br, next); + if (next == NULL) { + drbr_advance(dev, ring->br); + } else { + drbr_putback(dev, ring->br, next); + } break; } + drbr_advance(dev, ring->br); enqueued++; dev->if_obytes += next->m_pkthdr.len; if (next->m_flags & M_MCAST) @@ -959,7 +958,6 @@ mlx4_en_transmit_locked(struct ifnet *de ETHER_BPF_MTAP(dev, next); if ((dev->if_drv_flags & IFF_DRV_RUNNING) == 0) break; - next = drbr_dequeue(dev, ring->br); } if (enqueued > 0)