From: Jason Wolfe
To: freebsd-gnats-submit@FreeBSD.org
Date: Sat, 13 Oct 2012 21:52:08 GMT
Subject: misc/172675: sysctl_tcp_hc_list (net.inet.tcp.hostcache.list) race condition causing memory corruption

>Number:         172675
>Category:       misc
>Synopsis:       sysctl_tcp_hc_list (net.inet.tcp.hostcache.list) race condition causing memory corruption
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Oct 13 22:00:00 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Jason Wolfe
>Release:        8.3-RELEASE-p4
>Organization:
Limelight Networks
>Environment:
FreeBSD xxx 8.3-RELEASE-p4 FreeBSD 8.3-RELEASE-p4 #0: Mon Aug 20 18:41:21 MST 2012 jason@xxx:/usr/obj/usr/src/sys/SIXFOUR amd64
>Description:
It appears that in sysctl_tcp_hc_list(), the handler behind 'sysctl net.inet.tcp.hostcache.list', there is a somewhat rare chance of cache_count being lower than the actual number of entries in the host cache. I'm not sure whether the count changes out from under the sysctl after it has sized its output buffer, or whether the value simply isn't kept up to date. Either way, on heavily loaded boxes (1000+ more or less unique requests/s) it is not hard to cause memory corruption and a panic by running 'sysctl net.inet.tcp.hostcache.list' with the current code. A colleague of mine spotted the issue and believes the patch below addresses it (a small userland sketch of the failure mode follows before the patch).
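The sketch is my own illustration, not kernel code, and every name in it is hypothetical: one thread keeps inserting entries while the main thread sizes a dump buffer from a snapshot of the counter, so by the time the list is walked the buffer can already be too small.

/*
 * Minimal userland sketch of the suspected failure mode -- not FreeBSD
 * kernel code; all names are made up.  One thread keeps inserting
 * entries; the main thread sizes a dump buffer from a snapshot of the
 * counter, then checks how much space a dump of the (possibly larger)
 * list would actually need.
 *
 * Build (hypothetical): cc -o hc_race hc_race.c -lpthread
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define LINESIZE 128			/* bytes reserved per dumped entry */

struct entry {
	struct entry *next;
};

static struct entry *head;
static int cache_count;			/* stands in for V_tcp_hostcache.cache_count */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *
grower(void *arg)
{
	int i;

	for (i = 0; i < 200000; i++) {
		struct entry *e = calloc(1, sizeof(*e));

		pthread_mutex_lock(&lock);
		e->next = head;
		head = e;
		cache_count++;
		pthread_mutex_unlock(&lock);
	}
	return (arg);
}

int
main(void)
{
	pthread_t t;
	struct entry *e;
	size_t needed = 0;
	int bufsize;

	pthread_create(&t, NULL, grower, NULL);

	/* Size the buffer from a snapshot of the counter... */
	pthread_mutex_lock(&lock);
	bufsize = LINESIZE * (cache_count + 1);
	pthread_mutex_unlock(&lock);

	/* ...then walk the list, which may have kept growing in the meantime. */
	pthread_mutex_lock(&lock);
	for (e = head; e != NULL; e = e->next)
		needed += LINESIZE;
	pthread_mutex_unlock(&lock);

	printf("buffer sized for %d bytes, dump would need %zu bytes\n",
	    bufsize, needed);

	pthread_join(t, NULL);
	return (0);
}

In the kernel the per-bucket locks keep the lists themselves consistent, but nothing appears to stop the cache from growing between the buffer allocation and the snprintf() loop, which is the window the patch tries to cover.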
I've been testing the patch by simply running the hostcache.list sysctl in a loop, and it has held up where the stock code would have panicked in fairly short order.

Patch by: Nils McCarthy
>How-To-Repeat:
Run 'sysctl net.inet.tcp.hostcache.list' continuously (a small C loop that does the same thing is sketched at the end of this report).
>Fix:
Possible patch supplied.

Patch attached with submission follows:

--- src.stock/sys/netinet/tcp_hostcache.c	2012-03-02 23:15:13.000000000 -0700
+++ src/sys/netinet/tcp_hostcache.c	2012-10-12 19:54:55.000000000 -0700
@@ -360,7 +360,7 @@
 		}
 		TAILQ_REMOVE(&hc_head->hch_bucket, hc_entry, rmx_q);
 		V_tcp_hostcache.hashbase[hash].hch_length--;
-		V_tcp_hostcache.cache_count--;
+		atomic_subtract_int(&V_tcp_hostcache.cache_count, 1);
 		TCPSTAT_INC(tcps_hc_bucketoverflow);
 #if 0
 		uma_zfree(V_tcp_hostcache.zone, hc_entry);
@@ -392,7 +392,7 @@
 	 */
 	TAILQ_INSERT_HEAD(&hc_head->hch_bucket, hc_entry, rmx_q);
 	V_tcp_hostcache.hashbase[hash].hch_length++;
-	V_tcp_hostcache.cache_count++;
+	atomic_add_int(&V_tcp_hostcache.cache_count, 1);
 	TCPSTAT_INC(tcps_hc_added);
 
 	return hc_entry;
@@ -587,6 +587,7 @@
 static int
 sysctl_tcp_hc_list(SYSCTL_HANDLER_ARGS)
 {
+	int alloclines;
 	int bufsize;
 	int linesize = 128;
 	char *p, *buf;
@@ -595,8 +596,10 @@
 #ifdef INET6
 	char ip6buf[INET6_ADDRSTRLEN];
 #endif
-
-	bufsize = linesize * (V_tcp_hostcache.cache_count + 1);
+
+	/* overallocate in case the host cache size changes while we're reading it */
+	alloclines = V_tcp_hostcache.cache_count * 3 / 2;
+	bufsize = linesize * (alloclines + 1);
 
 	p = buf = (char *)malloc(bufsize, M_TEMP, M_WAITOK|M_ZERO);
 
@@ -606,7 +609,7 @@
 	p += len;
 
 #define msec(u) (((u) + 500) / 1000)
-	for (i = 0; i < V_tcp_hostcache.hashsize; i++) {
+	for (i = 0; i < V_tcp_hostcache.hashsize && alloclines > 0; i++) {
 		THC_LOCK(&V_tcp_hostcache.hashbase[i].hch_mtx);
 		TAILQ_FOREACH(hc_entry, &V_tcp_hostcache.hashbase[i].hch_bucket,
 			      rmx_q) {
@@ -624,7 +627,7 @@
 			    msec(hc_entry->rmx_rtt *
 				(RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
 			    msec(hc_entry->rmx_rttvar *
-				(RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
+				(RTM_RTTUNIT / (hz * TCP_RTTVAR_SCALE))),
 			    hc_entry->rmx_bandwidth * 8,
 			    hc_entry->rmx_cwnd,
 			    hc_entry->rmx_sendpipe,
@@ -633,6 +636,10 @@
 			    hc_entry->rmx_updates,
 			    hc_entry->rmx_expire);
 			p += len;
+			alloclines--;
+			if (alloclines == 0) {
+				break;
+			}
 		}
 		THC_UNLOCK(&V_tcp_hostcache.hashbase[i].hch_mtx);
 	}
@@ -660,7 +667,7 @@
 				    hc_entry, rmx_q);
 				uma_zfree(V_tcp_hostcache.zone, hc_entry);
 				V_tcp_hostcache.hashbase[i].hch_length--;
-				V_tcp_hostcache.cache_count--;
+				atomic_subtract_int(&V_tcp_hostcache.cache_count, 1);
 			} else
 				hc_entry->rmx_expire -= V_tcp_hostcache.prune;
 		}

>Release-Note:
>Audit-Trail:
>Unformatted:
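For reference, here is the reproduction loop mentioned under How-To-Repeat as a small C program instead of re-running sysctl(8). This is only an illustrative sketch using the standard sysctlbyname(3) interface; the program and file names are made up.

/*
 * Reproduction helper (userland sketch): fetch net.inet.tcp.hostcache.list
 * in a tight loop, which is equivalent to running sysctl(8) against it
 * continuously.  Stop it with Ctrl-C.
 *
 * Build (hypothetical): cc -o hc_loop hc_loop.c
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdlib.h>

int
main(void)
{
	const char *name = "net.inet.tcp.hostcache.list";

	for (;;) {
		size_t len = 0;
		char *buf;

		/* First call asks the kernel how large the list is... */
		if (sysctlbyname(name, NULL, &len, NULL, 0) == -1)
			err(1, "sysctlbyname(%s) size probe", name);
		if ((buf = malloc(len)) == NULL)
			err(1, "malloc");
		/* ...second call fetches it; both calls exercise the handler. */
		if (sysctlbyname(name, buf, &len, NULL, 0) == -1)
			warn("sysctlbyname(%s) fetch", name);
		free(buf);
	}
	/* NOTREACHED */
}

On an idle machine this just spins; the panic only tends to show up when the host cache is being churned at the request rates described above.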