Date: Tue, 23 Jul 2002 15:13:09 -0700 (PDT)
From: Julian Elischer
To: Peter Wemm
Cc: Yann Berthier, current@freebsd.org, alfred@freebsd.org
Subject: Re: Is it just me or has -current suddenly got massively unstable?
In-Reply-To: <20020723070704.7B4CB3925@overcee.wemm.org>

On Tue, 23 Jul 2002, Peter Wemm wrote:

>
>         thread_zone = uma_zcreate("THREAD", sizeof (struct thread),
>             thread_ctor, thread_dtor, thread_init, thread_fini,
> -           UMA_ALIGN_CACHE, 0);
> +           UMA_ALIGN_CACHE, UMA_ZONE_NOFREE);
>  }
>
>  /*
>
> I haven't panicked yet with that change. :-)  For some unknown reason,
> selwakeup() is dereferencing pointers to threads that have long gone and
> the backing store has been freed.  The patch above is a bandaid, not a
> solution.  It basically prevents threads ever being freed back to the
> general pool, even though everything here supposedly does not need that.
> (unlike struct proc and socket, for example).

Peter.. this comment in selrecord scared the heck out of me..

---
	/*
	 * If the thread is NULL then take ownership of selinfo
	 * however if the thread is not NULL and the thread points to
	 * someone else, then we have a collision, otherwise leave it alone
	 * as we've owned it in a previous selrecord on this selinfo.
	 */
-------

It suggests that select still doesn't clean up after itself.
Looking in select(), however, I see:

	if (timo > 0)
		error = cv_timedwait_sig(&selwait, &sellock, timo);
	else
		error = cv_wait_sig(&selwait, &sellock);

	if (error == 0)
		goto retry;

done:
	clear_selinfo_list(td);

This suggests that there is no way to exit this function without clearing
the thread pointers, but your trace suggests otherwise..
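
To make the concern concrete, here is a minimal userland sketch of the
ownership rule that the selrecord comment describes, plus the cleanup that
select() is supposed to perform on exit. It is not the kernel code: the
struct layouts, the si_collision field and all sketch_* names are
assumptions made up for illustration, and locking (sellock) is left out
entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct thread {
	int td_id;
};

struct selinfo {
	struct thread *si_thread;	/* owning selector, or NULL */
	int si_collision;		/* a second thread selected on this too */
};

/* Ownership rule as described in the quoted selrecord comment. */
static void
sketch_selrecord(struct thread *td, struct selinfo *sip)
{
	if (sip->si_thread == NULL)
		sip->si_thread = td;		/* unowned: take ownership */
	else if (sip->si_thread != td)
		sip->si_collision = 1;		/* owned by someone else */
	/* otherwise td already owns it from an earlier selrecord */
}

/* Stand-in for clear_selinfo_list(td): drop every pointer to td. */
static void
sketch_clear_selinfo_list(struct thread *td, struct selinfo *sips, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sips[i].si_thread == td)
			sips[i].si_thread = NULL;
}

/* Count the selinfo entries that still point at td. */
static size_t
sketch_refs_to(const struct thread *td, const struct selinfo *sips, size_t n)
{
	size_t i, refs = 0;

	for (i = 0; i < n; i++)
		if (sips[i].si_thread == td)
			refs++;
	return (refs);
}

/*
 * One select()-like pass: record interest in everything, then clean up
 * on the way out.  Returns the number of pointers to td left behind.
 */
static size_t
sketch_select_pass(struct thread *td, struct selinfo *sips, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		sketch_selrecord(td, &sips[i]);
	assert(sketch_refs_to(td, sips, n) <= n);
	sketch_clear_selinfo_list(td, sips, n);
	return (sketch_refs_to(td, sips, n));
}
```

In this model, sketch_select_pass() always returns 0, which matches what
the "done:" path in select() appears to guarantee. If some exit path
skipped the clear, the count would be non-zero and a selinfo would keep
pointing at the thread after it was freed, which is the kind of stale
pointer selwakeup() seems to be tripping over in the trace.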