From owner-freebsd-stable@FreeBSD.ORG Fri May 27 11:47:29 2005
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5142616A41C;
	Fri, 27 May 2005 11:47:29 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [204.156.12.53])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1238743D4C;
	Fri, 27 May 2005 11:47:28 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by cyrus.watson.org (Postfix) with ESMTP id 2B50446B04;
	Fri, 27 May 2005 07:47:28 -0400 (EDT)
Date: Fri, 27 May 2005 12:47:28 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Marc Olzheim
In-Reply-To: <20050510131005.GA4083@stack.nl>
Message-ID: <20050527124531.U727@fledge.watson.org>
References: <20050503150014.GG17096@stack.nl>
	<20050510131005.GA4083@stack.nl>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-stable@freebsd.org, bug-followup@FreeBSD.org
Subject: Re: kern/78824: race condition close()ing and read()ing the same
	socketpair on SMP.
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Fri, 27 May 2005 11:47:29 -0000

On Tue, 10 May 2005, Marc Olzheim wrote:

> On Tue, May 03, 2005 at 05:00:14PM +0200, Marc Olzheim wrote:
>> Is this going to be fixed before 5.4 ? It still breaks on today's
>> 5.4-STABLE.
>
> As this is the only issue known to me now, that I don't have a patch for
> and is standing in my way of upgrading from FreeBSD 4.x to 5.x, I would
> like to know whether this is a simple bug that could be fixed in a
> second or not... If there are any issues (like being able to reproduce
> it) or not, please let me know where I can be of assistance.

Hmm.  I'm unable to reproduce this on local SMP hardware, although I can
see at least one way that the race could occur.  Could you try the
attached patch and see if it helps matters?  This is a slight shot in the
dark but closes at least two races in the transition of socket state with
respect to socket buffer state.

Robert N M Watson

Index: uipc_socket2.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.145
diff -u -r1.145 uipc_socket2.c
--- uipc_socket2.c	12 Mar 2005 13:39:39 -0000	1.145
+++ uipc_socket2.c	27 May 2005 11:34:03 -0000
@@ -159,15 +159,12 @@
 {
 
 	/*
-	 * XXXRW: This code separately acquires SOCK_LOCK(so) and
-	 * SOCKBUF_LOCK(&so->so_rcv) even though they are the same mutex to
-	 * avoid introducing the assumption that they are the same.
+	 * XXXRW: This code assumes that SOCK_LOCK(so) and
+	 * SOCKBUF_LOCK(&so->so_rcv) are the same.
 	 */
-	SOCK_LOCK(so);
+	SOCKBUF_LOCK(&so->so_rcv);
 	so->so_state &= ~SS_ISCONNECTING;
 	so->so_state |= SS_ISDISCONNECTING;
-	SOCK_UNLOCK(so);
-	SOCKBUF_LOCK(&so->so_rcv);
 	so->so_rcv.sb_state |= SBS_CANTRCVMORE;
 	sorwakeup_locked(so);
 	SOCKBUF_LOCK(&so->so_snd);
@@ -182,16 +179,12 @@
 {
 
 	/*
-	 * XXXRW: This code separately acquires SOCK_LOCK(so) and
-	 * SOCKBUF_LOCK(&so->so_rcv) even though they are the same mutex to
-	 * avoid introducing the assumption that they are the same.
+	 * XXXRW: This code assumes that SOCK_LOCK(so) and
+	 * SOCKBUF_LOCK(&so->so_rcv) are the same.
 	 */
-	/* XXXRW: so_state locking? */
 	SOCK_LOCK(so);
 	so->so_state &= ~(SS_ISCONNECTING|SS_ISCONNECTED|SS_ISDISCONNECTING);
 	so->so_state |= SS_ISDISCONNECTED;
-	SOCK_UNLOCK(so);
-	SOCKBUF_LOCK(&so->so_rcv);
 	so->so_rcv.sb_state |= SBS_CANTRCVMORE;
 	sorwakeup_locked(so);
 	SOCKBUF_LOCK(&so->so_snd);