Date: Thu, 22 Mar 2018 04:53:22 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Warner Losh <imp@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r331298 - head/sys/dev/syscons Message-ID: <20180322024846.S4293@besplex.bde.org> In-Reply-To: <201803211447.w2LElDcK091988@repo.freebsd.org> References: <201803211447.w2LElDcK091988@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 21 Mar 2018, Warner Losh wrote: > Log: > Unlock giant when calling shutdown_nice() This breaks the driver. Giant is syscons' driver lock, and also the interrupt handler lock for at least the atkbd keyboard driver, so vt sometimes holds the lock for. vt has to sprinkle lots of Giant locking and unlocking where the corresponding locking is automatic for syscons, but this seems to be nonsense in its event handler. vt has to acquire Giant when calling the keyboard driver, but event handling is a call from the keyboard driver, usually from the interrupt handler with a suitable lock held. For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard drivers that use the kbd(4) for the top level use Giant looking since toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no effect since Giant is held by the caller. So vt_kbdevent() calls vt_processkey() with Giant held despite it not acquiring Giant explicitly; vt_processkey() calls vt_machine_kbdevent() and that calls various shutdown functions, all with Giant held. > Modified: head/sys/dev/syscons/syscons.c > ============================================================================== > --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 (r331297) > +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 (r331298) > @@ -3858,22 +3858,28 @@ next_code: > > case RBT: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(0); > + } > #endif > break; > > case HALT: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(RB_HALT); > + } > #endif > break; > > case PDWN: > #ifndef SC_DISABLE_REBOOT > - if (enable_reboot && !(flags & SCGETC_CN)) > + if (enable_reboot && !(flags & SCGETC_CN)) { > + mtx_unlock(&Giant); > shutdown_nice(RB_HALT|RB_POWEROFF); > + } > #endif > break; The new bugs should cause assertion failures. shutdown_nice() just signals init and returns. Giant is not re-acquired, so assertions should fail when the caller drops the lock. So the shutdown should be a nasty panic. Transiently dropping the lock is probably not fatal depending on what the caller does. On x86 with atkbd, nested interrupts are prevented by masking the in the ATPIC or APIC -- Giant is not really the driver lock, but just a lock for interfacing between related screen and keyboard drivers. Serial console drivers with fast interrupt handlers have much more broken locking for ddb special keys. It is invalid to either drop locks or call the "any" function from a fast interrupt handler, but buggy serial console drivers calls kbd_alt_break(), and that now calls shutdown_nice() and other functions that cannot be called from a fast interrupt handler. ddb keys supply most of the shutdown_nice() functionality for serial consoles, and there are no escape sequence to get this without ddb or maybe another debugger, so these bugs don't affect most users. Handling this correctly requires much the same fix as an unsafe signal handler, and fixes have much the same problems -- not much more than setting a flag is safe, and the flag might never be looked at if the system is in a bad state. However, if a nice shutdown is possible then the sytem must be in a good enough state to poll for flags. For normal signal handlers, there is no problem sending a signal to init or at least with setting a flag and waking up some thread to check the flag. I don't quite understand this commit. It should be valid to send a signal to init() in proc or non-fast ithread context including with Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it would be a LOR to call it with a spinlock held, and is even more obviously wrong to call it from kbd_alt_break() with interrupts masked and possibly a spinlock held, but Giant is a special sleep lock that causes fewer LORs than most sleep locks. Actual testing shows that doesn't cause a panic, but it also doesn't actually unlock for shutdown_nice(), since the lock is acquired twice and only released once. syscons has much the same extra lock sprinkling for event handling as vt: - intr_event_execute_handlers() acquires Giant and calls atkbdintr() - atkbdintr() calls sckbdevent() - sckbdevent() unnecessarily acquires Giant again - the buggy unlocking drops Giant just once - shutdown_nice() is called with Giant held - the buggy unlocking fails to re-acquire Giant - sckbdevent() releases Giant, leaving it not held - sckbdevent() returns - atkbdintr() returns - intr_event_execute_handlers() releases Giant. This should panic, but it apparently blocks for long enough for init to shut down first. When I trace the last step, I get a panic which might be either from the different timing or just a bug in kdb. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20180322024846.S4293>