From: Stephan Uphoff
To: Peter Holm
Cc: Julian Elischer, "freebsd-arch@freebsd.org"
Subject: Re: scheduler (sched_4bsd) questions
Date: Sun, 10 Oct 2004 15:50:08 -0400
Message-Id: <1097437808.80398.4.camel@palm.tree.com>
In-Reply-To: <20041005130308.GA2586@peter.osted.lan>

On Tue, 2004-10-05 at 09:03, Peter Holm wrote:
> On Mon, Oct 04, 2004 at 12:42:53PM -0700, Julian Elischer wrote:
>
> OK, I got a crash dump now, after a few modifications to kern_shutdown.c
>
> There are however a few strange things worth noticing:
>
> 1) There is no panic string:
>
> Mounted root from ufs:/dev/ad0s1a.
> pid 1146: corrected slot count (2->1)
> [thread 100796]
> Stopped at sched_add+0x13: movl 0x14c(%esi),%ebx
>
> 2) The gdb stack trace gets a bit weird at:
>
> #8 0xc07812da in calltrap () at ../../../i386/i386/exception.s:140
> #9 0xc05f0018 in flock (td=0x0, uap=0x0) at ../../../kern/kern_descrip.c:2138
> #10 0xc0619fd1 in setrunqueue (td=0xc2319180, flags=0x0) at kern_switch.c:521
> #11 0xc061921f in sched_wakeup (td=0xc2319180) at ../../../kern/sched_4bsd.c:859
>
> Where did flock() come from?
>
> The full console output is at http://www.holm.cc/stress/log/cons82.html
>
> - Peter

I am still puzzled.
My newest pet theory is that the sorting of the kg_runq is corrupted
before setrunqueue is called. Directly changing td_priority while the
thread is on the run queue would be an explanation.
However, the only instance I found is what I believe to be a rare
condition in which sleepq_resume_thread may be called while the thread
is on a run queue.
(John - what did I miss this time ...)

Peter, could you try this patch?

Index: subr_sleepqueue.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_sleepqueue.c,v
retrieving revision 1.11
diff -u -r1.11 subr_sleepqueue.c
--- subr_sleepqueue.c	19 Aug 2004 11:31:41 -0000	1.11
+++ subr_sleepqueue.c	10 Oct 2004 18:18:55 -0000
@@ -642,7 +642,7 @@
 	/* Adjust priority if requested. */
 	MPASS(pri == -1 || (pri >= PRI_MIN && pri <= PRI_MAX));
 	if (pri != -1 && td->td_priority > pri)
-		td->td_priority = pri;
+		sched_prio(td, pri);
 	setrunnable(td);
 	mtx_unlock_spin(&sched_lock);
 }

Should it crash again, could you walk the kg_runq to verify the sorting?

Thanks

Stephan
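
To illustrate the theory above outside the kernel, here is a minimal
userland sketch. It is not FreeBSD code: struct toy_thread and the toy_*
functions are hypothetical names, and the queue is a plain singly linked
list sorted by ascending priority rather than the real kg_runq. It shows
why writing the priority field of a queued entry in place can break the
sort order, how a walk over the queue can detect that, and one possible
safe alternative (remove, update, reinsert). Whether sched_prio repositions
a queued thread in exactly this way is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued thread; not the kernel's struct thread. */
struct toy_thread {
	int priority;			/* lower value runs first */
	struct toy_thread *next;
};

/* Insert td so that the list stays sorted by ascending priority. */
static void
toy_runq_insert(struct toy_thread **head, struct toy_thread *td)
{
	assert(head != NULL && td != NULL);
	while (*head != NULL && (*head)->priority <= td->priority)
		head = &(*head)->next;
	td->next = *head;
	*head = td;
}

/* Unlink td from the list if it is present. */
static void
toy_runq_remove(struct toy_thread **head, struct toy_thread *td)
{
	assert(head != NULL && td != NULL);
	while (*head != NULL && *head != td)
		head = &(*head)->next;
	if (*head == td) {
		*head = td->next;
		td->next = NULL;
	}
}

/* Walk the list; return 1 if priorities never decrease, 0 otherwise. */
static int
toy_runq_sorted(const struct toy_thread *head)
{
	for (; head != NULL && head->next != NULL; head = head->next)
		if (head->priority > head->next->priority)
			return (0);
	return (1);
}

/* Queue-aware priority change: take td off the queue, update, reinsert. */
static void
toy_set_prio(struct toy_thread **head, struct toy_thread *td, int pri)
{
	toy_runq_remove(head, td);
	td->priority = pri;
	toy_runq_insert(head, td);
}
```

With three entries queued at priorities 10, 20 and 30, assigning 5
directly to the last entry's priority field leaves it at the tail, and
toy_runq_sorted reports the queue as unsorted. Calling toy_set_prio with
the same value moves the entry to the head and the walk succeeds again.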