From: Jaime Bozza
To: Dylan Cochran
Cc: "freebsd-stable@freebsd.org"
Date: Mon, 26 Oct 2009 09:05:48 -0500
Subject: RE: Possible scheduler (SCHED_ULE) bug?

Sincerely,
Jaime Bozza
MindSites Group, LLC

From: Dylan Cochran [mailto:heliocentric@gmail.com]
> Superficially, this seems identical to a deadlock I reported for
> 7.1-RC1.
> Would you mind compiling a kernel with these options:
>
> KDB: stack backtrace:
> db_trace_self_wrapper(c0b55b52,e66e0ae0,c07615e9,c0b50617,8ca93,...) at db_trace_self_wrapper+0x26
> kdb_backtrace(c0b50617,8ca93,0,c41a7690,2,...) at kdb_backtrace+0x29
> hardclock(0,c07ff29d,0,0,4,...) at hardclock+0x1f9
> lapic_handle_timer(e66e0b08) at lapic_handle_timer+0x9c
> Xtimerint() at Xtimerint+0x1f
> --- interrupt, eip = 0xc07ff29d, esp = 0xe66e0b48, ebp = 0xe66e0c34 ---
> kern_sendfile(c41a7690,e66e0cfc,0,0,0,...) at kern_sendfile+0x90d
> do_sendfile(e66e0d2c,c0aba265,c41a7690,e66e0cfc,20,...) at do_sendfile+0xb1
> sendfile(c41a7690,e66e0cfc,20,16,e66e0d2c,...) at sendfile+0x13
> syscall(e66e0d38) at syscall+0x335
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (393, FreeBSD ELF32, sendfile), eip = 0x282cb0cb, esp = 0xbfbfc7cc, ebp = 0xbfbfe848 ---
> KDB: enter: watchdog timeout
>
> You can type 'reboot' to reboot the machine (in my case, panic would
> not work, so a useful dump wasn't in the cards)

Different offset on mine, but of course I'm using a different kernel.

kern_sendfile+0x6ad
do_sendfile+0xb1
sendfile+0x13

Luckily, I was able to get a panic, so I have all the files necessary to debug.

Here's the backtrace:

(kgdb) backtrace
#0  doadump () at pcpu.h:196
#1  0xc07f2c57 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07f2f62 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0497e47 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:446
#4  0xc04985bc in db_command (last_cmdp=0xc0ca9154, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:413
#5  0xc04986ca in db_command_loop () at /usr/src/sys/ddb/db_command.c:466
#6  0xc049a17d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228
#7  0xc081fdf6 in kdb_trap (type=3, code=0, tf=0xc72e2a5c) at /usr/src/sys/kern/subr_kdb.c:524
#8  0xc0b01b9b in trap (frame=0xc72e2a5c) at /usr/src/sys/i386/i386/trap.c:692
#9  0xc0ae58fb in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#10 0xc081ff7a in kdb_enter_why (why=0xc0b677b2 "watchdog", msg=0xc0b7ef1d "watchdog timeout") at cpufunc.h:60
#11 0xc07b0cad in hardclock (usermode=0, pc=3229966301) at /usr/src/sys/kern/kern_clock.c:640
#12 0xc0aedf1c in lapic_handle_timer (frame=0xc72e2afc) at /usr/src/sys/i386/i386/local_apic.c:785
#13 0xc0ae5edf in Xtimerint () at apic_vector.s:108
#14 0xc0855fdd in kern_sendfile (td=0xc771db40, uap=0xc72e2cfc, hdr_uio=0x0, trl_uio=0x0, compat=0) at atomic.h:160
#15 0xc0856d31 in do_sendfile (td=0xc771db40, uap=0xc72e2cfc, compat=0) at /usr/src/sys/kern/uipc_syscalls.c:1775
#16 0xc0856dd3 in sendfile (td=0xc771db40, uap=0xc72e2cfc) at /usr/src/sys/kern/uipc_syscalls.c:1746
#17 0xc0b01365 in syscall (frame=0xc72e2d38) at /usr/src/sys/i386/i386/trap.c:1094
#18 0xc0ae5960 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:262
#19 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

This is all a bit new to me (debugging, etc), so let me know if you need anything else!

Jaime