From owner-svn-src-all@freebsd.org Thu Oct 6 10:40:20 2016
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    by mailman.ysv.freebsd.org (Postfix) with ESMTP id 584CBAF6FEC;
    Thu, 6 Oct 2016 10:40:20 +0000 (UTC) (envelope-from slw@zxy.spb.ru)
Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 1A460260;
    Thu, 6 Oct 2016 10:40:20 +0000 (UTC) (envelope-from slw@zxy.spb.ru)
Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD))
    (envelope-from ) id 1bs662-000KFb-S0;
    Thu, 06 Oct 2016 13:40:14 +0300
Date: Thu, 6 Oct 2016 13:40:14 +0300
From: Slawa Olhovchenkov
To: Bruce Evans
Cc: Eric van Gyzen, src-committers@freebsd.org, svn-src-all@freebsd.org,
    Gleb Smirnoff, svn-src-head@freebsd.org
Subject: Re: svn commit: r306346 - head/sys/kern
Message-ID: <20161006104014.GE6177@zxy.spb.ru>
References: <201609261530.u8QFUUZd020174@repo.freebsd.org>
    <20161004205600.GN23123@FreeBSD.org>
    <20161005101932.U984@besplex.bde.org>
    <20161005204613.GD6177@zxy.spb.ru>
    <20161006135042.R2235@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161006135042.R2235@besplex.bde.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-SA-Exim-Mail-From: slw@zxy.spb.ru
X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)"
X-List-Received-Date: Thu, 06 Oct 2016 10:40:20 -0000

On Thu, Oct 06, 2016 at 02:08:46PM +1100, Bruce Evans wrote:
> On Wed, 5 Oct 2016, Slawa Olhovchenkov wrote:
>
> > On Wed, Oct 05,
2016 at 11:19:10AM +1100, Bruce Evans wrote:
> >
> >> On Tue, 4 Oct 2016, Gleb Smirnoff wrote:
> >>
> >>> On Mon, Sep 26, 2016 at 03:30:30PM +0000, Eric van Gyzen wrote:
> >>> E> ...
> >>> E> Modified: head/sys/kern/kern_mutex.c
> >>> E> ==============================================================================
> >>> E> --- head/sys/kern/kern_mutex.c Mon Sep 26 15:03:31 2016 (r306345)
> >>> E> +++ head/sys/kern/kern_mutex.c Mon Sep 26 15:30:30 2016 (r306346)
> >>> E> @@ -924,7 +924,7 @@ __mtx_assert(const volatile uintptr_t *c
> >>> E>  {
> >>> E>      const struct mtx *m;
> >>> E>
> >>> E> -    if (panicstr != NULL || dumping)
> >>> E> +    if (panicstr != NULL || dumping || SCHEDULER_STOPPED())
> >>> E>          return;
> >>>
> >>> I wonder if all this disjunct can be reduced just to SCHEDULER_STOPPED()?
> >>> Positive panicstr and dumping imply scheduler stopped.
> >>
> >> 'dumping' doesn't imply SCHEDULER_STOPPED().
> >>
> >> Checking 'dumping' here seems to be just an old bug. It just breaks
> >> __mtx_assert(), while all other mutex operations work normally for dumping
> >> without panicking.
> >
> > [...]
> >
> > Is this related to halted (not rebooted) 11.0 after ~^B and `panic`?
>
> There might be related problems, but I don't see any here.
>
> > What I see on serial console:
> > =====
> > db> panic
> > panic: from debugger
>
> I wouldn't trust panic from the debugger, but it is safer than dump
> from the debugger (both are ddb commands, but this is another bug).
>
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at 0xffffffff8031fadb = db_trace_self_wrapper+0x2b/frame 0xfffffe1f9e198120
> > vpanic() at 0xffffffff804a0302 = vpanic+0x182/frame 0xfffffe1f9e1981a0
> > panic() at 0xffffffff804a0383 = panic+0x43/frame 0xfffffe1f9e198200
> > db_panic() at 0xffffffff8031d987 = db_panic+0x17/frame 0xfffffe1f9e198210
> > db_command() at 0xffffffff8031d019 = db_command+0x299/frame 0xfffffe1f9e1982e0
> > db_command_loop() at 0xffffffff8031cd74 = db_command_loop+0x64/frame 0xfffffe1f9e1982f0
> > db_trap() at 0xffffffff8031fc1b = db_trap+0xdb/frame 0xfffffe1f9e198380
> > kdb_trap() at 0xffffffff804dd8c3 = kdb_trap+0x193/frame 0xfffffe1f9e198410
> > trap() at 0xffffffff806e3065 = trap+0x255/frame 0xfffffe1f9e198620
> > calltrap() at 0xffffffff806cafd1 = calltrap+0x8/frame 0xfffffe1f9e198620
> > --- trap 0x3, rip = 0xffffffff804dd11e, rsp = 0xfffffe1f9e1986f0, rbp = 0xfffffe1f9e198710 ---
> > kdb_alt_break_internal() at 0xffffffff804dd11e = kdb_alt_break_internal+0x18e/frame 0xfffffe1f9e198710
> > kdb_alt_break() at 0xffffffff804dcf8b = kdb_alt_break+0xb/frame 0xfffffe1f9e198720
> > uart_intr_rxready() at 0xffffffff803e38a8 = uart_intr_rxready+0x98/frame 0xfffffe1f9e198750
> > uart_intr() at 0xffffffff803e4621 = uart_intr+0x121/frame 0xfffffe1f9e198790
> > intr_event_handle() at 0xffffffff8046c74b = intr_event_handle+0x9b/frame 0xfffffe1f9e1987e0
> > intr_execute_handlers() at 0xffffffff8076d2d8 = intr_execute_handlers+0x48/frame 0xfffffe1f9e198810
> > lapic_handle_intr() at 0xffffffff8077163f = lapic_handle_intr+0x3f/frame 0xfffffe1f9e198830
> > Xapic_isr1() at 0xffffffff806cb6b7 = Xapic_isr1+0xb7/frame 0xfffffe1f9e198830
> > --- interrupt, rip = 0xffffffff8032fedf, rsp = 0xfffffe1f9e198900, rbp = 0xfffffe1f9e198940 ---
> > acpi_cpu_idle() at 0xffffffff8032fedf = acpi_cpu_idle+0x2af/frame 0xfffffe1f9e198940
> > cpu_idle_acpi() at 0xffffffff8076ad1f = cpu_idle_acpi+0x3f/frame 0xfffffe1f9e198960
> > cpu_idle() at
0xffffffff8076adc5 = cpu_idle+0x95/frame 0xfffffe1f9e198980
> > sched_idletd() at 0xffffffff804cbbe5 = sched_idletd+0x495/frame 0xfffffe1f9e198a70
> > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e198ab0
> > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 0xfffffe1f9e198ab0
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>
> This looks like a normal kdb entry then a not so normal panic from ddb,
> but no problems.

Yes, I simply captured all the console output that followed the command (`panic`).

> > Uptime: 1d4h53m19s
> > Dumping 12148 out of 131020 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> > Dump complete
> > mps2: Sending StopUnit: path (xpt0:mps2:0:14:ffffffff): handle 12
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff): handle 9
> > mps2: Incrementing SSU count
> > =====
> >
> > This is normal reboot (by /sbin/reboot):
>
> Is the above just a hung dump from reboot, before going near ddb? That
> case should work, but perhaps it needs to be more careful about waiting
> for the other CPUs. Just stopping them is no good since it gives an
> even more fragile environment, like panicking or entering ddb.

The above is an attempt to collect a dump and reboot from KDB.
Similar output appears after an INVARIANTS panic:
====
panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
cpuid = 4
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff8032467b = db_trace_self_wrapper+0x2b/frame 0xfffffe1f9e1f8730
vpanic() at 0xffffffff804b5672 = vpanic+0x182/frame 0xfffffe1f9e1f87b0
kassert_panic() at 0xffffffff804b54e6 = kassert_panic+0x126/frame 0xfffffe1f9e1f8820
tcp_usr_detach() at 0xffffffff806564dc = tcp_usr_detach+0x1bc/frame 0xfffffe1f9e1f8850
sofree() at 0xffffffff8053de66 = sofree+0x1a6/frame 0xfffffe1f9e1f8880
tcp_close() at 0xffffffff8064dd8e = tcp_close+0x11e/frame 0xfffffe1f9e1f88b0
tcp_timer_2msl() at 0xffffffff80653c28 = tcp_timer_2msl+0x278/frame 0xfffffe1f9e1f88e0
softclock_call_cc() at 0xffffffff804cbacc = softclock_call_cc+0x19c/frame 0xfffffe1f9e1f89c0
softclock() at 0xffffffff804cbec7 = softclock+0x47/frame 0xfffffe1f9e1f89e0
intr_event_execute_handlers() at 0xffffffff8047aa86 = intr_event_execute_handlers+0x96/frame 0xfffffe1f9e1f8a20
ithread_loop() at 0xffffffff8047b106 = ithread_loop+0xa6/frame 0xfffffe1f9e1f8a70
fork_exit() at 0xffffffff804781b4 = fork_exit+0x84/frame 0xfffffe1f9e1f8ab0
fork_trampoline() at 0xffffffff80713fce = fork_trampoline+0xe/frame 0xfffffe1f9e1f8ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 54m39s
Dumping 7780 out of 131019 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
Dump complete
mps2: Sending StopUnit: path (xpt0:mps2:0:14:ffffffff): handle 12
mps2: Incrementing SSU count
mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff): handle 9
mps2: Incrementing SSU count
====

After this, a power reset is needed to reboot.

> >
> > ===
> > Sending StopUnit: path (xpt0:mps2:0:14:ffffffff): handle 13
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff): handle 9
> > mps2: Incrementing SSU count
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:18:ffffffff):
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:14:ffffffff):
> > ===
> >
> > ====
> > mps2: lagg0: link state changed to DOWN
> > Sending StopUnit: path (xpt0:mps2:0:14:ffffffff): handle 12
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff): handle 9
> > mps2: Incrementing SSU count
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:18:ffffffff):
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:14:ffffffff):
> > ====
>
> Bruce
> _______________________________________________
> svn-src-all@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-all
> To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"