From: Terry Lambert
Message-Id: <200104181826.LAA17706@usr02.primenet.com>
Subject: Reference counters
To: arch@freebsd.org
Date: Wed, 18 Apr 2001 18:26:23 +0000 (GMT)

As some people undoubtedly know, I've been fighting a bug with
credential reference counts, which appears to be related to a
close race under extreme load.

The way the reference counting is currently structured makes this
bug extremely hard to track down.

I would like to propose that the reference counting be
substantially changed, and that a general philosophy be adopted
in place of the current philosophy.

First, some minor discussion.

--

Here is code from

kern/kern_descrip.c:

    fsetown()
        crhold(curproc->p_ucred);
        sigio->sio_ucred = curproc->p_ucred;

    falloc()
        fp->f_cred = p->p_ucred;
        fp->f_ops = &badfileops;
        fp->f_seqcount = 1;
        crhold(fp->f_cred);

kern/uipc_socket2.c:

    sonewconn3()
        so->so_cred = p ? p->p_ucred : head->so_cred;
        crhold(so->so_cred);

kern/uipc_socket.c:

    socreate()
        so->so_cred = p->p_ucred;
        crhold(so->so_cred);

kern/kern_fork.c:

    fork1()
        MALLOC(p2->p_cred, struct pcred *, sizeof(struct pcred),
            M_SUBPROC, M_WAITOK);
        bcopy(p1->p_cred, p2->p_cred, sizeof(*p2->p_cred));
        p2->p_cred->p_refcnt = 1;
        crhold(p1->p_ucred);
        uihold(p1->p_cred->p_uidinfo);

--------------

Except for this last piece of code (which is amazingly distressing
to me; it's probably going to turn out to be the ultimate source
of the problem I'm seeing), this code lends itself to cleanup;
specifically:

    fsetown()
        sigio->sio_ucred = crhold(curproc->p_ucred);

    falloc()
        fp->f_ops = &badfileops;
        fp->f_seqcount = 1;
        fp->f_cred = crhold(p->p_ucred);

    sonewconn3()
        so->so_cred = p ? crhold(p->p_ucred) : crhold(head->so_cred);

etc.

This would let the crhold() function be replaced with a resource
tracking macro version. That in turn would let me identify the
instance of the reference allocation which is being doubly freed
somewhere in my list of 30,000 descriptors, and -- more important --
with instrumentation, permit me to identify the originating
allocation.

Specifically, I can replace crhold() with crdup(), and I get the
actual double free that's the problem (instead of the last free).

I further suggest that _ALL_ reference holds in FreeBSD receive
similar treatment.

Discussion?

                    Terry Lambert
                    terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
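
To illustrate the proposal above, here is a minimal user-space sketch of a value-returning hold wrapped in a tracking macro. It is not the kernel implementation: the cut-down `struct ucred`, the `crhold_traced()` helper, the `hold_site` record, and `falloc_like()` are all hypothetical names made up for the example, and a real tracker would presumably keep one record per outstanding reference rather than only the most recent one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel credential; only the count matters here. */
struct ucred {
    int cr_ref;
};

/* Hypothetical trace record: the call site of the most recent hold. */
struct hold_site {
    const char *file;
    int line;
};

static struct hold_site last_hold;
static int hold_count;

/*
 * Value-returning hold: bump the count, remember who asked,
 * and hand the same pointer back so it can be assigned directly.
 */
static struct ucred *
crhold_traced(struct ucred *cr, const char *file, int line)
{
    assert(cr != NULL);
    cr->cr_ref++;
    last_hold.file = file;
    last_hold.line = line;
    hold_count++;
    return cr;
}

/* Because the hold is now an expression, a macro can capture the call site. */
#define crhold(cr) crhold_traced((cr), __FILE__, __LINE__)

/* Hypothetical cut-down file structure, mirroring the falloc() example. */
struct file {
    struct ucred *f_cred;
};

static void
falloc_like(struct file *fp, struct ucred *proc_cred)
{
    /* Hold and assignment happen in one statement, as in the proposed cleanup. */
    fp->f_cred = crhold(proc_cred);
}
```

With the hold and the assignment fused, every reference acquisition goes through one expression, so swapping the macro body (for example, to something that copies the credential the way crdup() does) would not require touching the call sites.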