From owner-svn-src-all@freebsd.org Thu Mar 22 00:31:50 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1E8B4F529D2 for ; Thu, 22 Mar 2018 00:31:50 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-it0-x233.google.com (mail-it0-x233.google.com [IPv6:2607:f8b0:4001:c0b::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9BC3A71818 for ; Thu, 22 Mar 2018 00:31:49 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-it0-x233.google.com with SMTP id e195-v6so9141483ita.5 for ; Wed, 21 Mar 2018 17:31:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=JbSV3/lRfw5JBEsHaH3glG6/IaMh4f3pjo0VrqFDHRU=; b=qIzeEpujJgrpmSgKKG4XcPwJC95AxMC2aLT0ohIzc3V69URN+yX58FmizJNy4u/ZVq aJHFZqKvr6oy/oNp2LbEUudaxB1ztxfg6lDhchy5OOwLILbMi6mWXOXxQmH1JjimJajO XRidC3SvE29IyeHS5WY6GluEO2KNRAMf8bQe5UWBl+Qv8/C3gzEJOtmEodRPN93Pzig1 AzNEyMM3aInVcXBt8Nu9TwO33ZEFN8daDjbWcpwyrznpWQd58DS4Kzy2ZLP5Z3r1MEXd ds/0KNEU6gwmmgj2DfjfOibxu9dvcO2eHFsxm5uq7sgENceWQmKoAVSkLeF3KiXHSK23 gY0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=JbSV3/lRfw5JBEsHaH3glG6/IaMh4f3pjo0VrqFDHRU=; b=mRW19X+EYnoBPCI4iHeKjIxGgWyJZm8NUmC08o7rRYddCFj8y/r+PALXR5J+IMA0gG tnZ3TkxGJaeMfheFHlANHsy5D647AzL2Bx6DjoUBxczyhRwijD68wM8DoqxK4fYESRO5 vEnLr6FWQA8AiolbWm1EbNhMqsc8EajH4WSDr2RHWiQ08Mx2xfHu+LLyVQRTfOi+6pr5 BTlNDhMUT+SIvUTJ79rWiqvGy/VV87z+BgRkx9IqO8LkHOv/h/Rbe/GWV9QaGdRgHGIR aUIR8SAQsL8+iOIc85bYaCqaeXBI3kzRiUUwDCDrlnzbGPXbO+OTSJxKd6VHectRCO+o kM5A== X-Gm-Message-State: AElRT7GwlgjWHO9t418FU+OwS9chcSa4JWkrk9Mc2OdcTdogb/cHo2FH 2IyB+dG1D0/ZW9NV2huel6VpAUzrhY8XwW9h94WXLA== X-Google-Smtp-Source: AG47ELu+ZP4esfZDRRlBqtGvPsDPJmXVjpZALxSuRhxPTyxomdiFycXPAWniwHsE940OCFmTTaE7Cqz8qhyUaiH4LiE= X-Received: by 2002:a24:fa83:: with SMTP id v125-v6mr6779148ith.36.1521678708767; Wed, 21 Mar 2018 17:31:48 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.79.203.196 with HTTP; Wed, 21 Mar 2018 17:31:48 -0700 (PDT) X-Originating-IP: [2603:300b:6:5100:1052:acc7:f9de:2b6d] In-Reply-To: <20180322024846.S4293@besplex.bde.org> References: <201803211447.w2LElDcK091988@repo.freebsd.org> <20180322024846.S4293@besplex.bde.org> From: Warner Losh Date: Wed, 21 Mar 2018 18:31:48 -0600 X-Google-Sender-Auth: ISadeU0shefjGu9BpBZmUy76Rpk Message-ID: Subject: Re: svn commit: r331298 - head/sys/dev/syscons To: Bruce Evans Cc: Warner Losh , src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.25 X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Mar 2018 00:31:50 -0000 On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evans wrote: > On Wed, 21 Mar 2018, Warner Losh wrote: > > Log: >> Unlock giant when calling shutdown_nice() >> > > This breaks the driver. Giant is syscons' driver lock, and also the > interrupt handler lock for at least the atkbd keyboard driver, so vt > sometimes holds the lock for. > OK. I got carried away. You're right. The proper fix is to unlock Giant at the top of kern_reboot() instead. This handles the case where we call shutdown_nice() with no init running and have to call kern_reboot directly. Otherwise it's perfectly fine to just call shutdown_nice() with Giant held since we just signal init from there. So I'll revert this change and make that other change instead. > vt has to sprinkle lots of Giant locking and unlocking where the > corresponding locking is automatic for syscons, but this seems to be > nonsense in its event handler. vt has to acquire Giant when calling > the keyboard driver, but event handling is a call from the keyboard > driver, usually from the interrupt handler with a suitable lock held. > For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard > drivers that use the kbd(4) for the top level use Giant looking since > toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all > keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no > effect since Giant is held by the caller. So vt_kbdevent() calls > vt_processkey() with Giant held despite it not acquiring Giant explicitly; > vt_processkey() calls vt_machine_kbdevent() and that calls various > shutdown functions, all with Giant held. True, and covered above. > Modified: head/sys/dev/syscons/syscons.c >> ============================================================ >> ================== >> --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 >> (r331297) >> +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 >> (r331298) >> @@ -3858,22 +3858,28 @@ next_code: >> >> case RBT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(0); >> + } >> #endif >> break; >> >> case HALT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(RB_HALT); >> + } >> #endif >> break; >> >> case PDWN: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(&Giant); >> shutdown_nice(RB_HALT|RB_POWEROFF); >> + } >> #endif >> break; >> > > The new bugs should cause assertion failures. shutdown_nice() just > signals init and returns. Giant is not re-acquired, so assertions > should fail when the caller drops the lock. So the shutdown should > be a nasty panic. > I must have been using vt when I tested it, since I didn't see that. > Transiently dropping the lock is probably not fatal depending on what > the caller does. On x86 with atkbd, nested interrupts are prevented > by masking the in the ATPIC or APIC -- Giant is not really the driver > lock, but just a lock for interfacing between related screen and keyboard > drivers. > > Serial console drivers with fast interrupt handlers have much more > broken locking for ddb special keys. It is invalid to either drop locks > or call the "any" function from a fast interrupt handler, but buggy > serial console drivers calls kbd_alt_break(), and that now calls > shutdown_nice() and other functions that cannot be called from a fast > interrupt handler. ddb keys supply most of the shutdown_nice() > functionality for serial consoles, and there are no escape sequence to > get this without ddb or maybe another debugger, so these bugs don't > affect most users. > It's called it before my changes... I'll make a note to look into this. > Handling this correctly requires much the same fix as an unsafe signal > handler, and fixes have much the same problems -- not much more than > setting a flag is safe, and the flag might never be looked at if the > system is in a bad state. However, if a nice shutdown is possible then > the sytem must be in a good enough state to poll for flags. > > For normal signal handlers, there is no problem sending a signal to init > or at least with setting a flag and waking up some thread to check the > flag. > > I don't quite understand this commit. It should be valid to send a > signal to init() in proc or non-fast ithread context including with > Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it > would be a LOR to call it with a spinlock held, and is even more > obviously wrong to call it from kbd_alt_break() with interrupts masked > and possibly a spinlock held, but Giant is a special sleep lock that > causes fewer LORs than most sleep locks. > Yes, this commit is wrong. > Actual testing shows that doesn't cause a panic, but it also doesn't > actually unlock for shutdown_nice(), since the lock is acquired twice > and only released once. syscons has much the same extra lock sprinkling > for event handling as vt: > > - intr_event_execute_handlers() acquires Giant and calls atkbdintr() > - atkbdintr() calls sckbdevent() > - sckbdevent() unnecessarily acquires Giant again > - the buggy unlocking drops Giant just once > - shutdown_nice() is called with Giant held > - the buggy unlocking fails to re-acquire Giant > - sckbdevent() releases Giant, leaving it not held > - sckbdevent() returns > - atkbdintr() returns > - intr_event_execute_handlers() releases Giant. This should panic, but > it apparently blocks for long enough for init to shut down first. > > When I trace the last step, I get a panic which might be either from the > different timing or just a bug in kdb. Good point. I think the following change is good for everything except calling shutdown_nice() from a fast interrupt handler with noinit running: diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c index e5ea9644ad3f..564aecd811be 100644 --- a/sys/kern/kern_shutdown.c +++ b/sys/kern/kern_shutdown.c @@ -366,6 +366,12 @@ kern_reboot(int howto) { static int once = 0; + /* + * Drop Giant once and for all. + */ + while (mtx_owned(&Giant)) + mtx_unlock(&Giant); + #if defined(SMP) /* * Bind us to the first CPU so that all shutdown code runs there. Some Comments? Warner