Date: Sat, 13 Oct 2012 21:52:08 GMT From: Jason Wolfe <nitroboost@gmail.com> To: freebsd-gnats-submit@FreeBSD.org Subject: misc/172675: sysctl_tcp_hc_list (net.inet.tcp.hostcache.list) race condition causing memory corruption Message-ID: <201210132152.q9DLq8I5000120@red.freebsd.org> Resent-Message-ID: <201210132200.q9DM00q6088207@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
>Number: 172675
>Category: misc
>Synopsis: sysctl_tcp_hc_list (net.inet.tcp.hostcache.list) race condition causing memory corruption
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Oct 13 22:00:00 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Jason Wolfe
>Release: 8.3-RELEASE-p4
>Organization:
Limelight Networks
>Environment:
FreeBSD xxx 8.3-RELEASE-p4 FreeBSD 8.3-RELEASE-p4 #0: Mon Aug 20 18:41:21 MST 2012 jason@xxx:/usr/obj/usr/src/sys/SIXFOUR amd64
>Description:
It appears that in sysctl_tcp_hc_list, the function used for 'sysctl net.inet.tcp.hostcache.list', there is a somewhat rare chance of cache_count being lower than the actual number of entries. Not sure if this is happening by a race condition after execution time, or if the value just isn't updated constantly enough. On heavily loaded boxes (1000+ more or less unique req/s) it's not too tough to cause memory corruption and a panic running 'sysctl net.inet.tcp.hostcache.list' with the current code.
A colleague of mine had spotted the issue, and believes this patch would do the trick. I've been testing it by simply running the hostcache.list in a loop, and have had success where prior it would have caused a panic in fairly short order.
Patch by: Nils McCarthy <nils@shkoo.com>
>How-To-Repeat:
Run sysctl net.inet.tcp.hostcache.list continuously.
>Fix:
Possible patch supplied
Patch attached with submission follows:
--- src.stock/sys/netinet/tcp_hostcache.c 2012-03-02 23:15:13.000000000 -0700
+++ src/sys/netinet/tcp_hostcache.c 2012-10-12 19:54:55.000000000 -0700
@@ -360,7 +360,7 @@
}
TAILQ_REMOVE(&hc_head->hch_bucket, hc_entry, rmx_q);
V_tcp_hostcache.hashbase[hash].hch_length--;
- V_tcp_hostcache.cache_count--;
+ atomic_subtract_int(&V_tcp_hostcache.cache_count, 1);
TCPSTAT_INC(tcps_hc_bucketoverflow);
#if 0
uma_zfree(V_tcp_hostcache.zone, hc_entry);
@@ -392,7 +392,7 @@
*/
TAILQ_INSERT_HEAD(&hc_head->hch_bucket, hc_entry, rmx_q);
V_tcp_hostcache.hashbase[hash].hch_length++;
- V_tcp_hostcache.cache_count++;
+ atomic_add_int(&V_tcp_hostcache.cache_count, 1);
TCPSTAT_INC(tcps_hc_added);
return hc_entry;
@@ -587,6 +587,7 @@
static int
sysctl_tcp_hc_list(SYSCTL_HANDLER_ARGS)
{
+ int alloclines;
int bufsize;
int linesize = 128;
char *p, *buf;
@@ -595,8 +596,10 @@
#ifdef INET6
char ip6buf[INET6_ADDRSTRLEN];
#endif
-
- bufsize = linesize * (V_tcp_hostcache.cache_count + 1);
+
+ /* overallocate in case the host cache size changes while we're reading it */
+ alloclines = V_tcp_hostcache.cache_count*3/2;
+ bufsize = linesize * (alloclines + 1);
p = buf = (char *)malloc(bufsize, M_TEMP, M_WAITOK|M_ZERO);
@@ -606,7 +609,7 @@
p += len;
#define msec(u) (((u) + 500) / 1000)
- for (i = 0; i < V_tcp_hostcache.hashsize; i++) {
+ for (i = 0; i < V_tcp_hostcache.hashsize && alloclines > 0; i++) {
THC_LOCK(&V_tcp_hostcache.hashbase[i].hch_mtx);
TAILQ_FOREACH(hc_entry, &V_tcp_hostcache.hashbase[i].hch_bucket,
rmx_q) {
@@ -624,7 +627,7 @@
msec(hc_entry->rmx_rtt *
(RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
msec(hc_entry->rmx_rttvar *
- (RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
+ (RTM_RTTUNIT / (hz * TCP_RTTVAR_SCALE))),
hc_entry->rmx_bandwidth * 8,
hc_entry->rmx_cwnd,
hc_entry->rmx_sendpipe,
@@ -633,6 +636,10 @@
hc_entry->rmx_updates,
hc_entry->rmx_expire);
p += len;
+ alloclines--;
+ if (alloclines == 0) {
+ break;
+ }
}
THC_UNLOCK(&V_tcp_hostcache.hashbase[i].hch_mtx);
}
@@ -660,7 +667,7 @@
hc_entry, rmx_q);
uma_zfree(V_tcp_hostcache.zone, hc_entry);
V_tcp_hostcache.hashbase[i].hch_length--;
- V_tcp_hostcache.cache_count--;
+ atomic_subtract_int(&V_tcp_hostcache.cache_count, 1);
} else
hc_entry->rmx_expire -= V_tcp_hostcache.prune;
}
>Release-Note:
>Audit-Trail:
>Unformatted:
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201210132152.q9DLq8I5000120>
