From: Rick Macklem
Date: Tue, 22 Nov 2011 01:32:58 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org
Subject: svn commit: r227810 - stable/7/sys/rpc
Message-Id: <201111220132.pAM1Ww05068373@svn.freebsd.org>

Author: rmacklem
Date: Tue Nov 22 01:32:57 2011
New Revision: 227810
URL: http://svn.freebsd.org/changeset/base/227810

Log:
  MFC: r227059
  Both a crash reported on freebsd-current on Oct. 18 under the subject
  heading "mtx_lock() of destroyed mutex on NFS" and PR# 156168 appear
  to be caused by clnt_dg_destroy() closing down the socket prematurely.
  When to close down the socket is controlled by a reference count
  (cs_refs), but clnt_dg_create() checks for sb_upcall being non-NULL
  to decide if a new socket is needed.
  I believe the crashes were caused by the following race:
    clnt_dg_destroy() finds cs_refs == 0 and decides to delete socket
    clnt_dg_destroy() then loses race with clnt_dg_create() for
      acquisition of the SOCKBUF_LOCK()
    clnt_dg_create() finds sb_upcall != NULL and increments cs_refs to 1
    clnt_dg_destroy() then acquires SOCKBUF_LOCK(), sets sb_upcall to
      NULL and destroys socket

  This patch fixes the above race by changing clnt_dg_destroy() so that
  it acquires SOCKBUF_LOCK() before testing cs_refs.

  This is a slightly modified patch for stable/7. It fixes the above
  race, although others still exist, since some patches such as
  r193272 cannot be MFC'd.

  Tested by:	nonesuch at longcount.org (Mark Saad)
  PR:		kern/156168

Modified:
  stable/7/sys/rpc/clnt_dg.c
Directory Properties:
  stable/7/sys/   (props changed)
  stable/7/sys/cddl/contrib/opensolaris/   (props changed)
  stable/7/sys/contrib/dev/acpica/   (props changed)
  stable/7/sys/contrib/pf/   (props changed)

Modified: stable/7/sys/rpc/clnt_dg.c
==============================================================================
--- stable/7/sys/rpc/clnt_dg.c	Tue Nov 22 00:35:30 2011	(r227809)
+++ stable/7/sys/rpc/clnt_dg.c	Tue Nov 22 01:32:57 2011	(r227810)
@@ -811,18 +811,22 @@ clnt_dg_destroy(CLIENT *cl)
 	while (cu->cu_threads)
 		msleep(cu, &cs->cs_lock, 0, "rpcclose", 0);
 
+	mtx_unlock(&cs->cs_lock);
+	SOCKBUF_LOCK(&cu->cu_socket->so_rcv);
+	mtx_lock(&cs->cs_lock);
 	cs->cs_refs--;
 	if (cs->cs_refs == 0) {
-		mtx_destroy(&cs->cs_lock);
-		SOCKBUF_LOCK(&cu->cu_socket->so_rcv);
+		mtx_unlock(&cs->cs_lock);
 		cu->cu_socket->so_upcallarg = NULL;
 		cu->cu_socket->so_upcall = NULL;
 		cu->cu_socket->so_rcv.sb_flags &= ~SB_UPCALL;
 		SOCKBUF_UNLOCK(&cu->cu_socket->so_rcv);
+		mtx_destroy(&cs->cs_lock);
 		mem_free(cs, sizeof(*cs));
 		lastsocketref = TRUE;
 	} else {
 		mtx_unlock(&cs->cs_lock);
+		SOCKBUF_UNLOCK(&cu->cu_socket->so_rcv);
 		lastsocketref = FALSE;
 	}
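
The lock ordering the log describes (take SOCKBUF_LOCK() first, then test
cs_refs) can be sketched in userspace with POSIX mutexes. This is only an
illustrative model, not the kernel code: struct model_socket, model_init(),
model_acquire() and model_release() are hypothetical names, rcv_lock stands
in for SOCKBUF_LOCK() on so_rcv, cs_lock for cs->cs_lock, upcall_set for the
upcall pointer being non-NULL, and refs for cs_refs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the socket state touched by the
 * patch. None of these names exist in the FreeBSD tree.
 */
struct model_socket {
	pthread_mutex_t rcv_lock; /* models SOCKBUF_LOCK() on so_rcv */
	pthread_mutex_t cs_lock;  /* models cs->cs_lock */
	bool upcall_set;          /* models the upcall pointer being non-NULL */
	int refs;                 /* models cs_refs */
};

static void
model_init(struct model_socket *so)
{
	pthread_mutex_init(&so->rcv_lock, NULL);
	pthread_mutex_init(&so->cs_lock, NULL);
	so->upcall_set = false;
	so->refs = 0;
}

/* Models the create path: the upcall test picks between setup and reuse. */
static void
model_acquire(struct model_socket *so)
{
	pthread_mutex_lock(&so->rcv_lock);
	if (!so->upcall_set) {
		/* First user: nobody else can see refs yet. */
		so->upcall_set = true;
		so->refs = 1;
	} else {
		pthread_mutex_lock(&so->cs_lock);
		so->refs++;
		pthread_mutex_unlock(&so->cs_lock);
	}
	pthread_mutex_unlock(&so->rcv_lock);
}

/* Models the patched destroy path; returns true for the last reference. */
static bool
model_release(struct model_socket *so)
{
	bool last;

	/* Outer lock first, so the test below cannot race with acquire. */
	pthread_mutex_lock(&so->rcv_lock);
	pthread_mutex_lock(&so->cs_lock);
	assert(so->refs > 0);
	so->refs--;
	last = (so->refs == 0);
	pthread_mutex_unlock(&so->cs_lock);
	if (last)
		so->upcall_set = false; /* teardown still under the outer lock */
	pthread_mutex_unlock(&so->rcv_lock);
	return (last);
}
```

Because model_release() holds rcv_lock across both the refs test and the
clearing of upcall_set, model_acquire() can never observe upcall_set as true
after the count has already been found to be zero, which is the window the
log identifies. On most systems the sketch builds with cc -pthread.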