Date: Sat, 14 Dec 2002 17:34:44 -0800 (PST)
From: Matthew Dillon
Message-Id: <200212150134.gBF1Yit5060312@apollo.backplane.com>
To: "Brian F. Feldman"
Cc: Jake Burkholder, "Brian F. Feldman", John Baldwin, Kris Kennaway, current@FreeBSD.ORG, alpha@FreeBSD.ORG
Subject: Re: UMA panic under load
References: <200212150121.gBF1L15m014304@green.bikeshed.org>

It's a big mess. exit1() sets up vm->vm_freer = p and then
vmspace_exitfree() tests that and calls vmspace_dofree(). It looks
like vm->vm_freer is acting like an exit-lock, so only one
process/thread actually frees the vmspace.

But there are still some serious race conditions. If two threads go
into exit1() at the same time, but vmspace_exitfree() is called in the
reverse order, so that the first call to vmspace_exitfree() winds up
freeing the vmspace, the first process's vmspace might be ripped out
from under it.

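The interleaving described above can be walked through with a small single-threaded model. This is only a sketch under stated assumptions: the structures and the model_* functions are toy stand-ins invented for illustration, not the kernel's definitions, and the model assumes that the process which sees the reference count reach 0 in exit1() is the one recorded in vm_freer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct proc;

struct vmspace {
    int refcnt;             /* processes sharing this address space */
    struct proc *freer;     /* the "exit-lock": who may call dofree */
    bool freed;             /* set once the toy dofree has run */
};

struct proc {
    const char *name;
    struct vmspace *vm;
    bool still_running;     /* true until the process has been reaped */
};

/* Model of exit1(): drop the ref; whoever sees zero becomes the freer. */
static void model_exit1(struct proc *p)
{
    assert(p->vm != NULL && p->vm->refcnt > 0);
    if (--p->vm->refcnt == 0)
        p->vm->freer = p;
}

/* Model of vmspace_exitfree(): only the recorded freer frees. */
static void model_exitfree(struct proc *p)
{
    if (p->vm->freer == p)
        p->vm->freed = true;    /* stands in for vmspace_dofree() */
    p->still_running = false;
}

/* Returns true if the vmspace was freed while A was still using it. */
static bool run_bad_interleaving(void)
{
    struct vmspace vm = { .refcnt = 2, .freer = NULL, .freed = false };
    struct proc a = { "A", &vm, true };
    struct proc b = { "B", &vm, true };

    model_exit1(&a);        /* refcnt 2 -> 1 */
    model_exit1(&b);        /* refcnt 1 -> 0, freer = B */
    model_exitfree(&b);     /* B is reaped first and frees the vmspace */

    return vm.freed && a.still_running;
}
```

In this model run_bad_interleaving() returns true: B, the second process into exit1(), frees the vmspace while A has not yet finished exiting.
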
On the flip side, if several threads go into exit1() at the same time,
the vmspace's ref count may never be seen to be '0' if we move the
decrement to later on in the code. So my 'what if we did this' patch
will fix one problem and create another. The reference count must be
decremented where it is currently being decremented in exit1(), or
there is a chance that multiple exit1()'s will not see the ref count
drop to 0 (or be equal to 1).

On the flip side (again), vmspace_exitfree() really should not call
vmspace_dofree() unless it is the last process, which is not
necessarily the same process that detected the ref count going to 0
in exit1().

It's like we need a second ref count field for the vmspace structure,
one to determine when the initial bunch of garbage can be freed up
(sysV shared memory and such), and another to determine when
vmspace_dofree() can actually be called.

-Matt

:There are no normal reference count semantics; exit1() attempts to free
:parts of the vmspace. Sounds to me like a simple solution is to check for
:P_WEXIT both before and after incrementing the vmspace refcount.
:
:--
:Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
: <> green@FreeBSD.org <> bfeldman@tislabs.com \ The Power to Serve! \
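
The two-counter idea in the message can be sketched as a small single-threaded model. Everything here is an assumption made for illustration: the struct, the field name exitingcnt, and the model2_* functions are hypothetical, locking is ignored, and this is not the change that was committed to the tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-counter vmspace; field names are invented here. */
struct vmspace2 {
    int refcnt;         /* users of the address space */
    int exitingcnt;     /* processes between exit1() and being reaped */
    bool early_done;    /* stands in for sysV shm and similar teardown */
    bool freed;         /* stands in for vmspace_dofree() */
};

/* Model of exit1(): the ref count is dropped where it is dropped today. */
static void model2_exit1(struct vmspace2 *vm)
{
    assert(vm->refcnt > 0);
    vm->exitingcnt++;               /* taken before dropping refcnt */
    if (--vm->refcnt == 0)
        vm->early_done = true;      /* first count: early garbage */
}

/* Model of vmspace_exitfree(): only the last process out frees. */
static void model2_exitfree(struct vmspace2 *vm)
{
    assert(vm->exitingcnt > 0);
    if (--vm->exitingcnt == 0 && vm->refcnt == 0)
        vm->freed = true;           /* second count: final free */
}

/* Two processes exit; the second one into exit1() is reaped first. */
static bool two_count_scenario(void)
{
    struct vmspace2 vm = { .refcnt = 2 };
    bool premature;

    model2_exit1(&vm);              /* A: refcnt 2 -> 1 */
    model2_exit1(&vm);              /* B: refcnt 1 -> 0, early teardown */
    model2_exitfree(&vm);           /* B reaped first: must not free */
    premature = vm.freed;
    model2_exitfree(&vm);           /* A reaped last: now it frees */

    return vm.early_done && !premature && vm.freed;
}
```

In this model two_count_scenario() returns true: the early teardown still fires when the ref count reaches 0 in exit1(), while the final free waits for the last exiting process regardless of the order in which the processes are reaped.
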