From owner-freebsd-current@FreeBSD.ORG Fri Jun 6 09:39:49 2003
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 82D3237B404
	for ; Fri, 6 Jun 2003 09:39:49 -0700 (PDT)
Received: from mail.speakeasy.net (mail13.speakeasy.net [216.254.0.213])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6ACC443FB1
	for ; Fri, 6 Jun 2003 09:39:48 -0700 (PDT)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 21024 invoked from network); 6 Jun 2003 16:39:47 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
	(envelope-sender )encrypted SMTP
	for ; 6 Jun 2003 16:39:47 -0000
Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1])
	by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h56Gdjp0014086;
	Fri, 6 Jun 2003 12:39:45 -0400 (EDT)
	(envelope-from jhb@FreeBSD.org)
Message-ID:
X-Mailer: XFMail 1.5.4 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <20030606081040.GA65780@tombstone.localnet.gomerbud.com>
Date: Fri, 06 Jun 2003 12:39:46 -0400 (EDT)
From: John Baldwin
To: "David P. Reese Jr."
cc: jeffr@FreeBSD.org
cc: current@freebsd.org
Subject: RE: LOR: sched lock vs. sio + panic in sched_choose() [ULE + SMP panic]
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Jun 2003 16:39:49 -0000


On 06-Jun-2003 David P. Reese Jr. wrote:
> I've been getting a lot of these for the last two weeks on my SMP box.
> This panic is on -CURRENT from earlier today.  Scheduler is ULE.
>
> lock order reversal
>  1st 0xc047f820 sched lock (sched lock) @ /usr/src/sys/kern/kern_intr.c:548
>  2nd 0xc04b83c0 sio (sio) @ /usr/src/sys/dev/sio/sio.c:3242

This is a duplicate panic because you are using a serial console.

> Stack backtrace:
> backtrace(c0400378,c04b83c0,c0463120,c0463120,c041266b) at backtrace+0x17
> witness_lock(c04b83c0,8,c041266b,caa,c11efc00) at witness_lock+0x697
> _mtx_lock_spin_flags(c04b83c0,0,c041266b,caa,0) at _mtx_lock_spin_flags+0xd1
> siocnputc(c0463280,d,5,d1d62b68,0) at siocnputc+0x81
> cnputc(a,ffffffff,1,c0415c53,c) at cnputc+0x56
> putchar(a,d1d62b68,d1d62ab4,c0491d40,0) at putchar+0xcd
> kvprintf(c0415c52,c025eba0,d1d62b68,a,d1d62b88) at kvprintf+0x7d
> printf(c0415c52,c,c0415a4d,c03fe55b,c0489b20) at printf+0x57

This is the real panic below:

> trap_fatal(d1d62c14,38,d1d62bf0,c0236c9d,38) at trap_fatal+0x76
> trap(d1d60018,c0240010,c0470010,c11dcbe0,c0482280) at trap+0x123
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc0253ec7, esp = 0xd1d62c54, ebp = 0xd1d62c68 ---
> sched_choose(c11dee40,c03fe7a6,28c,0,c11db668) at sched_choose+0x77
> choosethread(c11dcbe0,2,c03fdb89,1dc,b6e81bd0) at choosethread+0x36
> mi_switch(c047f820,0,c03fb1fd,224,c11db5ac) at mi_switch+0x200
> ithread_loop(c11da180,d1d62d48,c03fb0ae,30c,55ff44fd) at ithread_loop+0x256
> fork_exit(c022caf0,c11da180,d1d62d48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xd1d62d7c, ebp = 0 ---
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; lapic.id = 01000000
> fault virtual address   = 0x38
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc0253ec7
> stack pointer           = 0x10:0xd1d62c54
> frame pointer           = 0x10:0xd1d62c68
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 14 (swi7: tty:sio clock)
> kernel: type 12 trap, code=0
> Stopped at      sched_choose+0x77:      movl    0x38(%eax),%eax

This is a ULE and SMP panic that Jeff keeps looking for.  Seems to be
a NULL pointer dereference of some sort.

> I recall most if not all of these panics occurring when swi7: tty:sio clock
> is the current process.  These are not completely repeatable, but if I
> simply reboot a couple of times, I can get the panic to occur while the
> rc scripts are being run.

Can you do a 'l *sched_choose+0x77' in gdb on kernel.debug to get the
source line corresponding to this panic?

-- 

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/