Date: Thu, 22 Mar 2018 05:31:18 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Bruce Evans <brde@optusnet.com.au> Cc: Warner Losh <imp@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r331298 - head/sys/dev/syscons Message-ID: <20180322050628.U5175@besplex.bde.org> In-Reply-To: <20180322024846.S4293@besplex.bde.org> References: <201803211447.w2LElDcK091988@repo.freebsd.org> <20180322024846.S4293@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 22 Mar 2018, Bruce Evans wrote: > On Wed, 21 Mar 2018, Warner Losh wrote: > >> Log: >> Unlock giant when calling shutdown_nice() > ... > This breaks the driver. Giant is syscons' driver lock, and also the > interrupt handler lock for at least the atkbd keyboard driver, so vt > sometimes holds the lock for. [That should have been "... lock too".] > ... > Actual testing shows that doesn't cause a panic, but it also doesn't > actually unlock for shutdown_nice(), since the lock is acquired twice > and only released once. syscons has much the same extra lock sprinkling > for event handling as vt: > > - intr_event_execute_handlers() acquires Giant and calls atkbdintr() > - atkbdintr() calls sckbdevent() > - sckbdevent() unnecessarily acquires Giant again > - the buggy unlocking drops Giant just once > - shutdown_nice() is called with Giant held > - the buggy unlocking fails to re-acquire Giant > - sckbdevent() releases Giant, leaving it not held > - sckbdevent() returns > - atkbdintr() returns > - intr_event_execute_handlers() releases Giant. This should panic, but > it apparently blocks for long enough for init to shut down first. > > When I trace the last step, I get a panic which might be either from the > different timing or just a bug in kdb. Testing with a kernel with other bugs fixed shows that the invariants violation really is detected, causing a panic instead of a nice shutdown for Ctrl-Alt-Del, but recursive panics pile up and printf() in -current is too broken to print anything. A better kernel prints: XX ppppppppppppppppppppppppppkernel trap 12 with interrupts disabled XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucrepanic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic where some messages are lost and some are duplicated (there are 8 CPUs). I got control by putting a breakpoint after stop_cpus_hard() in vpanic(). (A breakpoint at panic() crashes, probably for multiple CPUs hitting it, though this is supposed to be fixed in the test version.) The backtrace is then: XX Breakpoint at vpanic+0x4a: popl %ecx XX db> t XX Tracing pid 11 tid 100042 td 0xd6a61360 XX vpanic(c0916c91,d6685ab4,d6685ab4,c098f050,c098f040,...) at vpanic+0x4a/frame 0xd6685a8c XX kassert_panic(c0916c91,c093b956,c0930a24,c09066cf,557) at kassert_panic+0x49/frame 0xd6685aa8 XX witness_unlock(c098f040,8,c09066c6,557) at witness_unlock+0xe7/frame 0xd6685af0 XX __mtx_unlock_flags(c098f050,0,c09066c6,557) at __mtx_unlock_flags+0x65/frame 0xd6685b14 XX intr_event_execute_handlers(c8f466b0) at intr_event_execute_handlers+0xed/frame 0xd6685b40 XX ithread_execute_handlers(c8f466b0,0,80202,d6a61360,c8f466b0,...) at ithread_execute_handlers+0x21/frame 0xd6685b54 XX ithread_loop(d6cda250,d6685ba8,0,d6cda250,c06a2c56,...) at ithread_loop+0x5f/frame 0xd6685b74 XX fork_exit(c06a2c56,d6cda250,d6685ba8) at fork_exit+0x83/frame 0xd6685b94 XX fork_trampoline() at fork_trampoline+0x8 The stack trace is messed up are mis-decoded despite attempts to avoid this (use i386, don't use clang, and turn off auto-inlining...). panic() is not shown, and args passed in registers are not shown. There are just enough args to find the panic message. It is as expected: XX db> x/s 0xd6685ab4 XX 0xd6685ab4: V\271\223\300$\012\223\300\317f\220\300W\005 XX db> x/s 0xc0916c91 XX __func__.16466+0x1c41: lock (%s) %s not locked @ %s:%d Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20180322050628.U5175>