From owner-freebsd-current Mon Feb 11 20:41:28 2002 Delivered-To: freebsd-current@freebsd.org Received: from mail12.speakeasy.net (mail12.speakeasy.net [216.254.0.212]) by hub.freebsd.org (Postfix) with ESMTP id 4C5D037B6B8 for ; Mon, 11 Feb 2002 20:13:04 -0800 (PST) Received: (qmail 5479 invoked from network); 12 Feb 2002 04:12:58 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([65.91.152.149]) (envelope-sender ) by mail12.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 12 Feb 2002 04:12:58 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: Date: Mon, 11 Feb 2002 23:12:48 -0500 (EST) From: John Baldwin To: Julian Elischer Subject: Re: ucred holding patch, BDE version Cc: Alfred Perlstein , bde@freebsd.org, current@freebsd.org Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 12-Feb-02 Julian Elischer wrote: > > > On Mon, 11 Feb 2002, John Baldwin wrote: > >> >> On 12-Feb-02 Julian Elischer wrote: >> > The proclock is needed to get the reference, >> > guarding against other threads, and giant is needed fo rnot to free it >> > because if it reaches a refcount of 0 it needs to call free(). (which john >> > assures me needs Giant at this time). >> > We could avoid the proclock with judicious use of an atomic refcount >> > incrementing method. >> >> _No_! The proc lock is protecting p_ucred, it can't go away! What _can_ go >> away is the per-ucred mutex to protect the refcount if we ever revive the >> refcount API. > > hmm ok Alfred, here's the way I see it.. This is jhb, not alfred. > You are never permitted to "CHANGE" a cred. > You ALWAYS allocate another and switch them. > When you switch them you decrement the refcount of the old one. > If someone else takes a reference on it at the same moment that you > drop it, then the order is important down to the bus-cycle > as to whether it gets freed or not. We already know that dereferencing it > from the proc structure is not important, because a "stale" ucred pointer > is only preventable from the userland. NO, no! You only know it is stale in that case cause you are comparing a non-stale known-good pointer to p_ucred. If p_ucred is stale, then it is equal to td_ucred which still points to a valid ucred so it is ok in that particular case. If p_ucred is stale but points to a new ucred, getting the proc lock around crhold() ensures that you will gain a reference to the really correct value that is guaranteed to point to a correct value. Don't think in terms of bus cycles. On non-i386 architectures, a write can sit in a store buffer waiting until it is pushed out by a memory barrier or for the CPU to get around to posting it. You can't assume that a write will be visible to other CPU's "soon". > All that is important is that the pointer is a REAL pointer to a cred > and that it is not allowed to go to 0 references in its way to 1 > reference. But if it's a stale pointer, it's not a pointer to a ucred. That memory could have been free'd and now be a mbuf header. Now you go try to increment the refcount of a mbuf header. Bad juju here. :) Trust me, my jhb_proc kernels were locking up cause I missed one instance of setting p_args to NULL during exec so taht during wait we would trash random data memory when we tried to indirect the stale p_args pointer to dec the refcount. The problem is that you actually have 2 locks involved for 2 different things here during your crhold()'s: - proc lock which protects p_ucred in two ways: - 1) it ensures that the ucred has at least one reference that doesn't go away so it's ok for us to deref the poitner and increment the count - 2) it ensures that the value of the pointer we read is up to date - ucred mutex which protects the ucred refcount Only the second mutex can be replaced by pure atomic operations in theory. The first one cannot. In your comparison test, you aren't dereferencing p_ucred, so if it is stale, that is ok due to other issues. However, when you do crhold(p->p_ucred), you dereference the pointer, so it better darn well be pointing at a ucred and not a mbuf. > An atomic reference count increment that checks > against 0 would be part of it but might not be enough. > I think we also need to have a lock on something because > it might get freed and used for something else that happens to place a "1" > in that location inbetween the time that p->p_ucred is read > and the refcount is decremented. That is what the proc lock is doing. :-P > As an asside: > I think that in NT they may have refcounts in the same location in all > structures as I think they derived all their structures from a bas class > that has ref counts. In that case it WOULD have a "1" in that location if > it were re-used. (It's been a long time since I saw NT code so I may be > wrong) That would involve great evil. Let's not do that. How do you know the refcount hasn't been adjusted on some other structure anyways that you have a stale pointer to? Maybe that current structure is just now being freed? -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message