From owner-cvs-all@FreeBSD.ORG Wed Jun 9 18:19:44 2004 Return-Path: Delivered-To: cvs-all@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4695C16A4CE; Wed, 9 Jun 2004 18:19:44 +0000 (GMT) Received: from harmony.village.org (rover.village.org [168.103.84.182]) by mx1.FreeBSD.org (Postfix) with ESMTP id DA49943D1F; Wed, 9 Jun 2004 18:19:43 +0000 (GMT) (envelope-from imp@bsdimp.com) Received: from localhost (warner@rover2.village.org [10.0.0.1]) by harmony.village.org (8.12.11/8.12.11) with ESMTP id i59IGPEu045725; Wed, 9 Jun 2004 12:16:25 -0600 (MDT) (envelope-from imp@bsdimp.com) Date: Wed, 09 Jun 2004 12:16:40 -0600 (MDT) Message-Id: <20040609.121640.104033033.imp@bsdimp.com> To: bmilekic@FreeBSD.org From: "M. Warner Losh" In-Reply-To: <20040609161818.GA25348@freefall.freebsd.org> References: <20040609161818.GA25348@freefall.freebsd.org> X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit cc: cvs-src@FreeBSD.org cc: phk@phk.freebsd.dk cc: src-committers@FreeBSD.org cc: cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/kern kern_proc.c X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2004 18:19:44 -0000 In message: <20040609161818.GA25348@freefall.freebsd.org> Bosko Milekic writes: : Frankly, I think it is obvious. If it were obvious, then people wouldn't be making this mistake. : In the BAD case, you decrement safely but you compare for the refcount : having hit zero without the lock protecting the count being taken, : so what happens is that there is a race window between where you drop : the lock and check for the zero refcount where another racing thread : could drop the refcount finally to zero and go into the check as : well. Briefly, this can happen: : : - refcount of 'foo' is 2. : - thread 1 enters BAD code, lowers refcount to 1, releases lock. : - thread 2 enters BAD code, lowers refcount to 0, releases lock. : - thread 1 checks against refcount being zero, decides it is now, : and proceeds to destroy(foo). : - thread 2 checks against refcount being zero, decides it is now, : and proceeds to destroy(foo) as well. : : Conclusion: foo is destroyed twice. Yes. That's correct. I understood that, but only after I sent the message and thought about it for a while. That tends to be my metric for 'needs a comment' or 'needs to be documented.' : The GOOD code does not suffer from this problem. Here is a way to : handle this sort of race if your reference counter is instead : manipulated atomically (as opposed to protected by a mutex): : [From Mbuf-related code] : : MEXT_REM_REF(m); /* Atomic decrement of m->m_ext.ref_cnt */ : if (atomic_cmpset_int(m->m_ext.ref_cnt, 0, 1)) { : /* Do the free here... */ : } : return; Maybe we need a locking section in the handbook that talks about these techniques. The more people that know and understand them, the better the locking that will happen in the rest of the kernel. Warner