Message-Id: <201308142020.r7EKKgnT022557@svn.freebsd.org>
From: Navdeep Parhar
Date: Wed, 14 Aug 2013 20:20:42 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r254336 - user/np/cxl_tuning/sys/netinet

Author: np
Date: Wed Aug 14 20:20:42 2013
New Revision: 254336
URL: http://svnweb.freebsd.org/changeset/base/254336

Log:
  Add a last-modified timestamp to each LRO entry and provide an interface
  to flush all inactive entries.  Drivers decide when to flush and what
  the inactivity threshold should be.

  Network drivers that process an rx queue to completion can enter a
  livelock type situation when the rate at which packets are received
  reaches equilibrium with the rate at which the rx thread is processing
  them.  When this happens the final LRO flush (normally when the rx
  routine is done) does not occur.  Pure ACKs and segments with total
  payload < 64K can get stuck in an LRO entry.  Symptoms are that TCP
  tx-mostly connections' performance falls off a cliff during heavy,
  unrelated rx on the interface.

  Flushing only inactive LRO entries works better than any of these
  alternates that I tried:
  - don't LRO pure ACKs
  - flush _all_ LRO entries periodically (every 'x' microseconds or every
    'y' descriptors)
  - stop rx processing in the driver periodically and schedule remaining
    work for later.

Modified:
  user/np/cxl_tuning/sys/netinet/tcp_lro.c
  user/np/cxl_tuning/sys/netinet/tcp_lro.h

Modified: user/np/cxl_tuning/sys/netinet/tcp_lro.c
==============================================================================
--- user/np/cxl_tuning/sys/netinet/tcp_lro.c	Wed Aug 14 19:34:13 2013	(r254335)
+++ user/np/cxl_tuning/sys/netinet/tcp_lro.c	Wed Aug 14 20:20:42 2013	(r254336)
@@ -194,6 +194,25 @@ tcp_lro_rx_csum_fixup(struct lro_entry *
 #endif
 
 void
+tcp_lro_flush_inactive(struct lro_ctrl *lc, const struct timeval *timeout)
+{
+	struct lro_entry *le, *le_tmp;
+	struct timeval tv;
+
+	if (SLIST_EMPTY(&lc->lro_active))
+		return;
+
+	getmicrotime(&tv);
+	timevalsub(&tv, timeout);
+	SLIST_FOREACH_SAFE(le, &lc->lro_active, next, le_tmp) {
+		if (timevalcmp(&tv, &le->mtime, >=)) {
+			SLIST_REMOVE(&lc->lro_active, le, lro_entry, next);
+			tcp_lro_flush(lc, le);
+		}
+	}
+}
+
+void
 tcp_lro_flush(struct lro_ctrl *lc, struct lro_entry *le)
 {
 
@@ -543,7 +562,8 @@ tcp_lro_rx(struct lro_ctrl *lc, struct m
 		if (le->p_len > (65535 - lc->ifp->if_mtu)) {
 			SLIST_REMOVE(&lc->lro_active, le, lro_entry, next);
 			tcp_lro_flush(lc, le);
-		}
+		} else
+			getmicrotime(&le->mtime);
 
 		return (0);
 	}
@@ -556,6 +576,7 @@ tcp_lro_rx(struct lro_ctrl *lc, struct m
 	le = SLIST_FIRST(&lc->lro_free);
 	SLIST_REMOVE_HEAD(&lc->lro_free, next);
 	SLIST_INSERT_HEAD(&lc->lro_active, le, next);
+	getmicrotime(&le->mtime);
 
 	/* Start filling in details. */
 	switch (eh_type) {

Modified: user/np/cxl_tuning/sys/netinet/tcp_lro.h
==============================================================================
--- user/np/cxl_tuning/sys/netinet/tcp_lro.h	Wed Aug 14 19:34:13 2013	(r254335)
+++ user/np/cxl_tuning/sys/netinet/tcp_lro.h	Wed Aug 14 20:20:42 2013	(r254336)
@@ -30,6 +30,8 @@
 #ifndef _TCP_LRO_H_
 #define _TCP_LRO_H_
 
+#include
+
 struct lro_entry
 {
 	SLIST_ENTRY(lro_entry) next;
@@ -59,6 +61,7 @@ struct lro_entry
 	uint32_t		tsecr;
 	uint16_t		window;
 	uint16_t		timestamp;	/* flag, not a TCP hdr field. */
+	struct timeval		mtime;
 };
 
 SLIST_HEAD(lro_head, lro_entry);
@@ -83,6 +86,7 @@ struct lro_ctrl {
 
 int tcp_lro_init(struct lro_ctrl *);
 void tcp_lro_free(struct lro_ctrl *);
+void tcp_lro_flush_inactive(struct lro_ctrl *, const struct timeval *);
 void tcp_lro_flush(struct lro_ctrl *, struct lro_entry *);
 int tcp_lro_rx(struct lro_ctrl *, struct mbuf *, uint32_t);
 
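
For anyone who wants to play with the selection rule outside the kernel,
here is a minimal userland sketch of the inactivity check that
tcp_lro_flush_inactive() performs in the diff above.  It is not part of this
commit.  The names sim_entry, sim_ctrl, sim_insert and sim_flush_inactive
are hypothetical stand-ins; the list is hand-rolled rather than SLIST-based,
the current time is passed in by the caller instead of being read with
getmicrotime(), and "flushing" only unlinks the entry and marks it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical, much simplified stand-in for struct lro_entry. */
struct sim_entry {
	struct sim_entry *next;
	struct timeval mtime;	/* last time a segment was merged in */
	int flushed;		/* set once the entry has been "flushed" */
};

/* Hypothetical stand-in for struct lro_ctrl: just the active list. */
struct sim_ctrl {
	struct sim_entry *active;
};

/* Return nonzero if *a >= *b. */
int
tv_ge(const struct timeval *a, const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec)
		return (a->tv_sec > b->tv_sec);
	return (a->tv_usec >= b->tv_usec);
}

/* a -= b, keeping tv_usec in [0, 1000000). */
void
tv_sub(struct timeval *a, const struct timeval *b)
{
	a->tv_sec -= b->tv_sec;
	a->tv_usec -= b->tv_usec;
	if (a->tv_usec < 0) {
		a->tv_sec--;
		a->tv_usec += 1000000;
	}
}

/* Put an entry at the head of the active list with the given mtime. */
void
sim_insert(struct sim_ctrl *sc, struct sim_entry *e, long sec, long usec)
{
	e->mtime.tv_sec = sec;
	e->mtime.tv_usec = usec;
	e->flushed = 0;
	e->next = sc->active;
	sc->active = e;
}

/*
 * Unlink and mark every active entry whose mtime is at or before
 * (now - timeout).  Returns the number of entries flushed.
 */
int
sim_flush_inactive(struct sim_ctrl *sc, const struct timeval *now,
    const struct timeval *timeout)
{
	struct sim_entry **prevp, *e;
	struct timeval cutoff;
	int n = 0;

	assert(sc != NULL && now != NULL && timeout != NULL);
	cutoff = *now;
	tv_sub(&cutoff, timeout);

	prevp = &sc->active;
	while ((e = *prevp) != NULL) {
		if (tv_ge(&cutoff, &e->mtime)) {
			*prevp = e->next;	/* unlink; prevp stays put */
			e->next = NULL;
			e->flushed = 1;
			n++;
		} else
			prevp = &e->next;
	}
	return (n);
}
```

With the real interface, the log message leaves both the call site and the
timeout to the driver, so how often an rx routine would call
tcp_lro_flush_inactive(), and with what threshold, is not shown here.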