Date: Wed, 21 Mar 2018 18:31:48 -0600 From: Warner Losh <imp@bsdimp.com> To: Bruce Evans <brde@optusnet.com.au> Cc: Warner Losh <imp@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r331298 - head/sys/dev/syscons Message-ID: <CANCZdfoUCimTKFQJgBzZAEva4=vfF69s=nE36OvDEQwDxMmE7A@mail.gmail.com> In-Reply-To: <20180322024846.S4293@besplex.bde.org> References: <201803211447.w2LElDcK091988@repo.freebsd.org> <20180322024846.S4293@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evans <brde@optusnet.com.au> wrote: > On Wed, 21 Mar 2018, Warner Losh wrote: > > Log: >> Unlock giant when calling shutdown_nice() >> > > This breaks the driver. Giant is syscons' driver lock, and also the > interrupt handler lock for at least the atkbd keyboard driver, so vt > sometimes holds the lock for. > OK. I got carried away. You're right. The proper fix is to unlock Giant at the top of kern_reboot() instead. This handles the case where we call shutdown_nice() with no init running and have to call kern_reboot directly. Otherwise it's perfectly fine to just call shutdown_nice() with Giant held since we just signal init from there. So I'll revert this change and make that other change instead. > vt has to sprinkle lots of Giant locking and unlocking where the > corresponding locking is automatic for syscons, but this seems to be > nonsense in its event handler. vt has to acquire Giant when calling > the keyboard driver, but event handling is a call from the keyboard > driver, usually from the interrupt handler with a suitable lock held. > For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard > drivers that use the kbd(4) for the top level use Giant looking since > toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all > keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no > effect since Giant is held by the caller. So vt_kbdevent() calls > vt_processkey() with Giant held despite it not acquiring Giant explicitly; > vt_processkey() calls vt_machine_kbdevent() and that calls various > shutdown functions, all with Giant held. True, and covered above. > Modified: head/sys/dev/syscons/syscons.c >> ============================================================ >> ================== >> --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 >> (r331297) >> +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 >> (r331298) >> @@ -3858,22 +3858,28 @@ next_code: >> >> case RBT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(0); >> + } >> #endif >> break; >> >> case HALT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(RB_HALT); >> + } >> #endif >> break; >> >> case PDWN: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(RB_HALT|RB_POWEROFF); >> + } >> #endif >> break; >> > > The new bugs should cause assertion failures. shutdown_nice() just > signals init and returns. Giant is not re-acquired, so assertions > should fail when the caller drops the lock. So the shutdown should > be a nasty panic. > I must have been using vt when I tested it, since I didn't see that. > Transiently dropping the lock is probably not fatal depending on what > the caller does. On x86 with atkbd, nested interrupts are prevented > by masking the in the ATPIC or APIC -- Giant is not really the driver > lock, but just a lock for interfacing between related screen and keyboard > drivers. > > Serial console drivers with fast interrupt handlers have much more > broken locking for ddb special keys. It is invalid to either drop locks > or call the "any" function from a fast interrupt handler, but buggy > serial console drivers calls kbd_alt_break(), and that now calls > shutdown_nice() and other functions that cannot be called from a fast > interrupt handler. ddb keys supply most of the shutdown_nice() > functionality for serial consoles, and there are no escape sequence to > get this without ddb or maybe another debugger, so these bugs don't > affect most users. > It's called it before my changes... I'll make a note to look into this. > Handling this correctly requires much the same fix as an unsafe signal > handler, and fixes have much the same problems -- not much more than > setting a flag is safe, and the flag might never be looked at if the > system is in a bad state. However, if a nice shutdown is possible then > the sytem must be in a good enough state to poll for flags. > > For normal signal handlers, there is no problem sending a signal to init > or at least with setting a flag and waking up some thread to check the > flag. > > I don't quite understand this commit. It should be valid to send a > signal to init() in proc or non-fast ithread context including with > Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it > would be a LOR to call it with a spinlock held, and is even more > obviously wrong to call it from kbd_alt_break() with interrupts masked > and possibly a spinlock held, but Giant is a special sleep lock that > causes fewer LORs than most sleep locks. > Yes, this commit is wrong. > Actual testing shows that doesn't cause a panic, but it also doesn't > actually unlock for shutdown_nice(), since the lock is acquired twice > and only released once. syscons has much the same extra lock sprinkling > for event handling as vt: > > - intr_event_execute_handlers() acquires Giant and calls atkbdintr() > - atkbdintr() calls sckbdevent() > - sckbdevent() unnecessarily acquires Giant again > - the buggy unlocking drops Giant just once > - shutdown_nice() is called with Giant held > - the buggy unlocking fails to re-acquire Giant > - sckbdevent() releases Giant, leaving it not held > - sckbdevent() returns > - atkbdintr() returns > - intr_event_execute_handlers() releases Giant. This should panic, but > it apparently blocks for long enough for init to shut down first. > > When I trace the last step, I get a panic which might be either from the > different timing or just a bug in kdb. Good point. I think the following change is good for everything except calling shutdown_nice() from a fast interrupt handler with noinit running: diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c index e5ea9644ad3f..564aecd811be 100644 --- a/sys/kern/kern_shutdown.c +++ b/sys/kern/kern_shutdown.c @@ -366,6 +366,12 @@ kern_reboot(int howto) { static int once = 0; + /* + * Drop Giant once and for all. + */ + while (mtx_owned(&Giant)) + mtx_unlock(&Giant); + #if defined(SMP) /* * Bind us to the first CPU so that all shutdown code runs there. Some Comments? Warner
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfoUCimTKFQJgBzZAEva4=vfF69s=nE36OvDEQwDxMmE7A>