From owner-cvs-all@FreeBSD.ORG Wed Jun 9 16:18:26 2004 Return-Path: Delivered-To: cvs-all@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 986CD16A4CE; Wed, 9 Jun 2004 16:18:26 +0000 (GMT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8F16B43D2D; Wed, 9 Jun 2004 16:18:26 +0000 (GMT) (envelope-from bmilekic@FreeBSD.org) Received: from freefall.freebsd.org (bmilekic@localhost [127.0.0.1]) i59GIIuD025497; Wed, 9 Jun 2004 16:18:18 GMT (envelope-from bmilekic@freefall.freebsd.org) Received: (from bmilekic@localhost) by freefall.freebsd.org (8.12.11/8.12.11/Submit) id i59GII2V025496; Wed, 9 Jun 2004 16:18:18 GMT (envelope-from bmilekic) Date: Wed, 9 Jun 2004 16:18:18 +0000 From: Bosko Milekic To: "M. Warner Losh" Message-ID: <20040609161818.GA25348@freefall.freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i cc: cvs-src@FreeBSD.org cc: phk@phk.freebsd.dk cc: src-committers@FreeBSD.org cc: cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/kern kern_proc.c X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2004 16:18:26 -0000 Warner wrote: >Poul-Henning wrote: >: Not to pick on anybody, but this is a perfect example of getting locking >: almost right: >: >: BAD: >: >: LOCK(foo->lock) >: foo->refcount--; >: UNLOCK(foo->lock) >: if (foo->refcount == 0) >: destroy(foo); >: >: GOOD: >: >: LOCK(foo->lock) >: i = --foo->refcount; >: UNLOCK(foo->lock) >: if (i == 0) >: destroy(foo); >: > >Can you provide a couple of lines about why BAD is BAD and why GOOD >fixes that flaw? That should help others from making this mistake in >the future. > >Warner Frankly, I think it is obvious. In the BAD case, you decrement safely but you compare for the refcount having hit zero without the lock protecting the count being taken, so what happens is that there is a race window between where you drop the lock and check for the zero refcount where another racing thread could drop the refcount finally to zero and go into the check as well. Briefly, this can happen: - refcount of 'foo' is 2. - thread 1 enters BAD code, lowers refcount to 1, releases lock. - thread 2 enters BAD code, lowers refcount to 0, releases lock. - thread 1 checks against refcount being zero, decides it is now, and proceeds to destroy(foo). - thread 2 checks against refcount being zero, decides it is now, and proceeds to destroy(foo) as well. Conclusion: foo is destroyed twice. The GOOD code does not suffer from this problem. Here is a way to handle this sort of race if your reference counter is instead manipulated atomically (as opposed to protected by a mutex): [From Mbuf-related code] MEXT_REM_REF(m); /* Atomic decrement of m->m_ext.ref_cnt */ if (atomic_cmpset_int(m->m_ext.ref_cnt, 0, 1)) { /* Do the free here... */ } return; -Bosko