Date: Tue, 17 Jan 2012 21:57:35 +0400
From: Gleb Smirnoff
To: mdf@FreeBSD.org
Cc: current@FreeBSD.org
Subject: Re: new panic in cpu_reset() with WITNESS
Message-ID: <20120117175735.GJ12760@FreeBSD.org>
References: <20120117110242.GD12760@glebius.int.ru>

On Tue, Jan 17, 2012 at 07:34:23AM -0800, mdf@freebsd.org wrote:
m> 2012/1/17 Gleb Smirnoff:
m> > New panic has been introduced somewhere between
m> > r229851 and r229932, that happens on shutdown if
m> > kernel has WITNESS and doesn't have WITNESS_SKIPSPIN.
m> >
m> > Uptime: 1h0m17s
m> > Rebooting...
m> > panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ /usr/src/head/sys/kern/kern_cons.c:500
m> > cpuid = 0
m> > KDB: enter: panic
m> > [ thread pid 1 tid 100001 ]
m> > Stopped at      kdb_enter+0x3b: movq    $0,0x514d32(%rip)
m> > db>
m> > db> bt
m> > Tracing pid 1 tid 100001 td 0xfffffe0001d5e000
m> > kdb_enter() at kdb_enter+0x3b
m> > panic() at panic+0x1c7
m> > _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x10f
m> > cnputs() at cnputs+0x7a
m> > putchar() at putchar+0x11f
m> > kvprintf() at kvprintf+0x83
m> > vprintf() at vprintf+0x85
m> > printf() at printf+0x67
m> > witness_checkorder() at witness_checkorder+0x773
m> > _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x99
m> > uart_cnputc() at uart_cnputc+0x3e
m> > cnputc() at cnputc+0x4c
m> > cnputs() at cnputs+0x26
m> > putchar() at putchar+0x11f
m> > kvprintf() at kvprintf+0x83
m> > vprintf() at vprintf+0x85
m> > printf() at printf+0x67
m> > cpu_reset() at cpu_reset+0x81
m> > kern_reboot() at kern_reboot+0x3a5
m> > sys_reboot() at sys_reboot+0x42
m> > amd64_syscall() at amd64_syscall+0x39e
m> > Xfast_syscall() at Xfast_syscall+0xf7
m> > --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40ea3c, rsp = 0x7fffffffd6d8, rbp = 0x49 ---
m> > db>
m> > db> show locks
m> > exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff809bc560) locked @ /usr/src/head/sys/kern/kern_module.c:101
m> > exclusive spin mutex smp rendezvous (smp rendezvous) r = 0 (0xffffffff80a08840) locked @ /usr/src/head/sys/kern/kern_shutdown.c:542
m> > db>
m> >
m> > So the problem is that we are holding smp rendezvous mutex during the cpu_reset().
m> > No mutexes should be obtained after it. However, since cpu_reset() does printing
m> > we obtain cnputs_mtx, and later obtain uart_hwmtx. The latter is hardcoded in
m> > the subr_witness.c as mutex to obtain before smp rendezvous, this triggers
m> > yet another printf from witness, that finally panics due to recursing on
m> > cnputs_mtx.
m>
m> At $WORK we explicitly marked cnputs_mtx as NO_WITNESS since it didn't
m> seem possible to fit it into the hierarchy in any sane way, since a
m> print can come from basically anywhere.
m>
m> If anyone has a better fix, that'd be great, but I haven't been able
m> to think of one.

Setting NO_WITNESS on cnputs_mtx won't help for the above problem, IMHO.

-- 
Totus tuus, Glebius.
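
The lock chain described in the quoted report can be modelled in a few lines of userland C. This is only a toy sketch, not kernel code: struct toy_mtx, toy_lock(), toy_printf() and toy_reboot() are hypothetical names invented for illustration, and the lock order check is reduced to a single boolean. It assumes, as the backtrace suggests, that the order-violation report is printed through the same console path that is already holding cnputs_mtx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive spin mutex (hypothetical, not struct mtx). */
struct toy_mtx {
    const char *name;
    bool held;
};

static struct toy_mtx toy_rendezvous = { "smp rendezvous", false };
static struct toy_mtx toy_cnputs     = { "cnputs_mtx", false };
static struct toy_mtx toy_uart       = { "uart_hwmtx", false };

/* Name of the mutex that was recursed on, or NULL if nothing "panicked". */
static const char *toy_panic_reason;

static void toy_printf(void);

/*
 * Acquire a toy mutex. order_violation models the order checker deciding
 * that this acquisition breaks the expected order; it reports that by
 * printing, before the lock is actually taken.
 */
static bool toy_lock(struct toy_mtx *m, bool order_violation)
{
    assert(m != NULL);
    if (m->held) {
        /* "recursed on non-recursive mutex": independent of the order check */
        toy_panic_reason = m->name;
        return false;
    }
    if (order_violation)
        toy_printf();               /* the report goes to the console */
    if (toy_panic_reason != NULL)
        return false;
    m->held = true;
    return true;
}

/* Console print path: take cnputs_mtx first, then the uart lock. */
static void toy_printf(void)
{
    if (!toy_lock(&toy_cnputs, false))
        return;
    /* uart_hwmtx is expected before smp rendezvous, so holding it is a violation */
    if (toy_lock(&toy_uart, toy_rendezvous.held))
        toy_uart.held = false;
    toy_cnputs.held = false;
}

/* Model of cpu_reset() printing while smp rendezvous may be held. */
const char *toy_reboot(bool rendezvous_held)
{
    toy_panic_reason = NULL;
    toy_cnputs.held = false;
    toy_uart.held = false;
    toy_rendezvous.held = rendezvous_held;
    toy_printf();
    return toy_panic_reason;
}
```

In this model toy_reboot(true) returns "cnputs_mtx", mirroring the panic message, while toy_reboot(false) returns NULL. The order violation is raised when taking the uart lock, and the recursion is caught by toy_lock() itself rather than by the order check, which is one way to read why exempting cnputs_mtx from order checking alone would not change the outcome.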