From owner-freebsd-current@FreeBSD.ORG Mon Jan 23 08:22:01 2012 Return-Path: Delivered-To: current@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7FAD5106566B; Mon, 23 Jan 2012 08:22:01 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id AAD268FC08; Mon, 23 Jan 2012 08:22:00 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA00672; Mon, 23 Jan 2012 10:21:59 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1RpFA6-000PTR-Rg; Mon, 23 Jan 2012 10:21:58 +0200 Message-ID: <4F1D18A5.8010006@FreeBSD.org> Date: Mon, 23 Jan 2012 10:21:57 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20111222 Thunderbird/9.0 MIME-Version: 1.0 To: Gleb Smirnoff References: <20120117110242.GD12760@glebius.int.ru> <20120120154158.GD16676@FreeBSD.org> <4F1ABFF3.9090305@FreeBSD.org> <20120122163539.GF16676@glebius.int.ru> In-Reply-To: <20120122163539.GF16676@glebius.int.ru> X-Enigmail-Version: undefined Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: current@FreeBSD.org Subject: Re: new panic in cpu_reset() with WITNESS X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jan 2012 08:22:01 -0000 on 22/01/2012 18:35 Gleb Smirnoff said the following: > On Sat, Jan 21, 2012 at 03:38:59PM +0200, Andriy Gapon wrote: > A> on 20/01/2012 17:41 Gleb Smirnoff said the following: > A> > > A> > On Tue, Jan 17, 2012 at 03:02:42PM +0400, Gleb Smirnoff wrote: > A> > T> New panic has been introduced somewhere between > A> > T> r229851 and r229932, that happens on shutdown if > A> > T> kernel has WITNESS and doesn't have WITNESS_SKIPSPIN. > A> > > A> > I've run through binary search and panic was introduced > A> > by r229854. > A> > > A> > A> Which means that it was introduced by one of the earlier commits (probably > A> mine). Thank you for this investigative work! > A> Although, to be frank and honest, I still don't see how the problem could be > A> a non-issue before. > > I can confirm that I can reboot succesfully with a new kernel > and kern.stop_scheduler_on_panic=0. This is very puzzling. You observation means that stop_scheduler_on_panic has an effect on the code outside the panic path. I reviewed the SCHEDULER_STOPPED() changes and I still can not see how that could be possible (modulo compiler bugs). Unfortunately, I also can not reproduce the problem here - without WITNESS_SKIPSPIN I am running into a panic[*] during boot. BTW, do you get any other LOR reports _before_ you do a reboot? Can you try to change printfs in witness to db_printfs? Perhaps this will allow to get the details of the LOR in uart_cnputc. Maybe that will reveal some important additional details. Finally, can you try the patch from my other post? [*] The panic: ... Root mount waiting for: usbus0 panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ /usr/src/sys/kern/kern_cons.c:500 cpuid = 0 curthread: 0xfffffe000230b900 stack: 0xffffff802cda3000 - 0xffffff802cda7000 stack pointer: 0xffffff802cda5c50 KDB: stack backtrace: db_trace_self_wrapper() at 0xffffffff802bf41a = db_trace_self_wrapper+0x2a kdb_backtrace() at 0xffffffff8053382a = kdb_backtrace+0x3a panic() at 0xffffffff804f8690 = panic+0x260 _mtx_lock_spin_flags() at 0xffffffff804e5eff = _mtx_lock_spin_flags+0xef cnputs() at 0xffffffff804b0ea6 = cnputs+0x36 putchar() at 0xffffffff8053887f = putchar+0x24f kvprintf() at 0xffffffff80536b1b = kvprintf+0x9b vprintf() at 0xffffffff80538225 = vprintf+0x85 printf() at 0xffffffff805382f7 = printf+0x67 witness_checkorder() at 0xffffffff8054c361 = witness_checkorder+0x821 _mtx_lock_spin_flags() at 0xffffffff804e5f1d = _mtx_lock_spin_flags+0x10d uart_cnputc() at 0xffffffff803c8e6c = uart_cnputc+0x3c cnputc() at 0xffffffff804b0acd = cnputc+0x6d cnputs() at 0xffffffff804b0ebb = cnputs+0x4b putchar() at 0xffffffff8053887f = putchar+0x24f kvprintf() at 0xffffffff80536b1b = kvprintf+0x9b vprintf() at 0xffffffff80538225 = vprintf+0x85 printf() at 0xffffffff805382f7 = printf+0x67 witness_checkorder() at 0xffffffff8054c361 = witness_checkorder+0x821 _mtx_lock_spin_flags() at 0xffffffff804e5f1d = _mtx_lock_spin_flags+0x10d cpu_new_callout() at 0xffffffff806fadbb = cpu_new_callout+0x9b callout_cc_add() at 0xffffffff8050e379 = callout_cc_add+0x129 callout_reset_on() at 0xffffffff8050efcc = callout_reset_on+0x24c sleepq_set_timeout() at 0xffffffff8053ff56 = sleepq_set_timeout+0x106 _sleep() at 0xffffffff805036f2 = _sleep+0x342 pause() at 0xffffffff80503920 = pause+0x90 usb_pause_mtx() at 0xffffffff803ee841 = usb_pause_mtx+0x71 usbd_req_set_address() at 0xffffffff803e9f1c = usbd_req_set_address+0xac usb_alloc_device() at 0xffffffff803def89 = usb_alloc_device+0x3d9 uhub_explore() at 0xffffffff803e5b63 = uhub_explore+0x3a3 usb_bus_explore() at 0xffffffff803d4187 = usb_bus_explore+0xd7 usb_process() at 0xffffffff803e8bc1 = usb_process+0xa1 fork_exit() at 0xffffffff804c801a = fork_exit+0x1aa fork_trampoline() at 0xffffffff806bf3ee = fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff802cda6d00, rbp = 0 --- KDB: enter: panic [ thread pid 15 tid 100031 ] Stopped at 0xffffffff8053348d = kdb_enter+0x3d: movq $0,0x11212d0(%rip) db> show locks exclusive sx USB config SX lock (USB config SX lock) r = 0 (0xfffffe0005040070) locked @ /usr/src/sys/dev/usb/usb_device.c:2643 exclusive spin mutex callout (callout) r = 0 (0xffffffff8160b3f8) locked @ /usr/src/sys/kern/kern_timeout.c:409 exclusive spin mutex sleepq chain (sleepq chain) r = 0 (0xffffffff81655668) locked @ /usr/src/sys/kern/subr_sleepqueue.c:237 -- Andriy Gapon