From owner-svn-src-head@freebsd.org Wed Mar 21 17:53:26 2018 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AE774F5860A; Wed, 21 Mar 2018 17:53:26 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id C61567D739; Wed, 21 Mar 2018 17:53:25 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id B777642C9D5; Thu, 22 Mar 2018 04:53:23 +1100 (AEDT) Date: Thu, 22 Mar 2018 04:53:22 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r331298 - head/sys/dev/syscons In-Reply-To: <201803211447.w2LElDcK091988@repo.freebsd.org> Message-ID: <20180322024846.S4293@besplex.bde.org> References: <201803211447.w2LElDcK091988@repo.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=FNpr/6gs c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=dk6ZlAi4zWyv9OyJo4MA:9 a=CjuIK1q_8ugA:10 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Mar 2018 17:53:27 -0000 On Wed, 21 Mar 2018, Warner Losh wrote: > Log: > Unlock giant when calling shutdown_nice() This breaks the driver. Giant is syscons' driver lock, and also the interrupt handler lock for at least the atkbd keyboard driver, so vt sometimes holds the lock for. vt has to sprinkle lots of Giant locking and unlocking where the corresponding locking is automatic for syscons, but this seems to be nonsense in its event handler. vt has to acquire Giant when calling the keyboard driver, but event handling is a call from the keyboard driver, usually from the interrupt handler with a suitable lock held. For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard drivers that use the kbd(4) for the top level use Giant looking since toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no effect since Giant is held by the caller. So vt_kbdevent() calls vt_processkey() with Giant held despite it not acquiring Giant explicitly; vt_processkey() calls vt_machine_kbdevent() and that calls various shutdown functions, all with Giant held. > Modified: head/sys/dev/syscons/syscons.c > ============================================================================== > --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 (r331297) > +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 (r331298) > @@ -3858,22 +3858,28 @@ next_code: > > case RBT: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(0); > + } > #endif > break; > > case HALT: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(RB_HALT); > + } > #endif > break; > > case PDWN: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(RB_HALT|RB_POWEROFF); > + } > #endif > break; The new bugs should cause assertion failures. shutdown_nice() just signals init and returns. Giant is not re-acquired, so assertions should fail when the caller drops the lock. So the shutdown should be a nasty panic. Transiently dropping the lock is probably not fatal depending on what the caller does. On x86 with atkbd, nested interrupts are prevented by masking the in the ATPIC or APIC -- Giant is not really the driver lock, but just a lock for interfacing between related screen and keyboard drivers. Serial console drivers with fast interrupt handlers have much more broken locking for ddb special keys. It is invalid to either drop locks or call the "any" function from a fast interrupt handler, but buggy serial console drivers calls kbd_alt_break(), and that now calls shutdown_nice() and other functions that cannot be called from a fast interrupt handler. ddb keys supply most of the shutdown_nice() functionality for serial consoles, and there are no escape sequence to get this without ddb or maybe another debugger, so these bugs don't affect most users. Handling this correctly requires much the same fix as an unsafe signal handler, and fixes have much the same problems -- not much more than setting a flag is safe, and the flag might never be looked at if the system is in a bad state. However, if a nice shutdown is possible then the sytem must be in a good enough state to poll for flags. For normal signal handlers, there is no problem sending a signal to init or at least with setting a flag and waking up some thread to check the flag. I don't quite understand this commit. It should be valid to send a signal to init() in proc or non-fast ithread context including with Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it would be a LOR to call it with a spinlock held, and is even more obviously wrong to call it from kbd_alt_break() with interrupts masked and possibly a spinlock held, but Giant is a special sleep lock that causes fewer LORs than most sleep locks. Actual testing shows that doesn't cause a panic, but it also doesn't actually unlock for shutdown_nice(), since the lock is acquired twice and only released once. syscons has much the same extra lock sprinkling for event handling as vt: - intr_event_execute_handlers() acquires Giant and calls atkbdintr() - atkbdintr() calls sckbdevent() - sckbdevent() unnecessarily acquires Giant again - the buggy unlocking drops Giant just once - shutdown_nice() is called with Giant held - the buggy unlocking fails to re-acquire Giant - sckbdevent() releases Giant, leaving it not held - sckbdevent() returns - atkbdintr() returns - intr_event_execute_handlers() releases Giant. This should panic, but it apparently blocks for long enough for init to shut down first. When I trace the last step, I get a panic which might be either from the different timing or just a bug in kdb. Bruce