From owner-svn-src-stable@freebsd.org Wed Mar 14 07:16:30 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57EBCF5B26B; Wed, 14 Mar 2018 07:16:30 +0000 (UTC) (envelope-from eadler@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0615B714EC; Wed, 14 Mar 2018 07:16:30 +0000 (UTC) (envelope-from eadler@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 00E7024D76; Wed, 14 Mar 2018 07:16:30 +0000 (UTC) (envelope-from eadler@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w2E7GTwO057169; Wed, 14 Mar 2018 07:16:29 GMT (envelope-from eadler@FreeBSD.org) Received: (from eadler@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w2E7GT9T057167; Wed, 14 Mar 2018 07:16:29 GMT (envelope-from eadler@FreeBSD.org) Message-Id: <201803140716.w2E7GT9T057167@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: eadler set sender to eadler@FreeBSD.org using -f From: Eitan Adler Date: Wed, 14 Mar 2018 07:16:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r330912 - stable/11/sys/dev/syscons X-SVN-Group: stable-11 X-SVN-Commit-Author: eadler X-SVN-Commit-Paths: stable/11/sys/dev/syscons X-SVN-Commit-Revision: 330912 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Mar 2018 07:16:30 -0000 Author: eadler Date: Wed Mar 14 07:16:29 2018 New Revision: 330912 URL: https://svnweb.freebsd.org/changeset/base/330912 Log: MFC r305121,r305231: Add some locking to sc_cngetc(). Keyboard input needs Giant locking, and that is not possible to do correctly here. Use mtx_trylock() and proceed unlocked as before if we can't acquire Giant (non-recursively), except in kdb mode don't even try to acquire Giant. Everything here is a hack, but it often works. Even if mtx_trylock() succeeds, this might be a LOR. Keyboard input also needs screen locking, to handle screen updates and switches. Add this, using the same simplistic screen locking as for sc_cnputc(). Giant must be acquired before the screen lock, and the screen lock must be dropped when calling the keyboard driver (else it would get a harmless LOR if it tries to acquire Giant). It was intended that sc cn open/close hide the locking calls, and they do for i/o functions functions except for this complication. Non-console keyboard input is still only Giant-locked, with screen locking in some called functions. This is correct for the keyboard parts only. When Giant cannot be acquired properly, atkbd and kbdmux tend to race and work (they assume that the caller acquired Giant properly and don't try to acquire it again or check that it has been acquired, and the races rarely matter), while ukbd tends to deadlock or panic (since it does the opposite, and has other usb threads to deadlock with). The keyboard (Giant) locking here does very little, but the screen locking completes screen locking for console mode except for not detecting or handling deadlock. The log message for the previous commit didn't mention the most the important detail that sc_cngetc() now opens and closes the keyboard on every call again. This was moved from sc_cngetc() to scn_cngrab/ ungrab() in r228644, but the change wasn't quite complete. After fixes for nesting in kbdd_poll() in ukbd and kbdmux, these opens and closes should have no significant effect if done while grabbed. They fix unusual cases when cngetc() is called while not grabbed. This commit is the main fix for screen locking in sc_cnputc(): detect deadlock or likely-deadlock and handle it by buffering the output atomically and printing it later if the deadlock condition clears (and sc_cnputc() is called). The most common deadlock is when the screen lock is held by ourself. Then it would be safe to acquire the lock recursively if the console driver is calling printf() in a safe context, but we don't know when that is. It is not safe to ignore the lock even in kdb or panic mode. But ignore it in panic mode. The only other known case of deadlock is when another thread holds the lock but is running on a stopped CPU. Detect that case approximately by using trylock and retrying for 1000 usec. On a 4 GHz CPU, 100 usec is almost long enough -- screen switches take slightly longer than that. Not retrying at all is good enough except for stress tests, and planned future versions will extend the timeout so that the stress tests work better. To see the behaviour when deadlock is detected, single step through sctty_outwakeup() (or sc_puts() to start with deadlock). Another (serial) console is needed to the buffered-only output, but the keyboard works in this context to continue or step out of the deadlocked region. The buffer is not large enough to hold all the output for this. Modified: stable/11/sys/dev/syscons/syscons.c stable/11/sys/dev/syscons/syscons.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/syscons/syscons.c ============================================================================== --- stable/11/sys/dev/syscons/syscons.c Wed Mar 14 07:11:33 2018 (r330911) +++ stable/11/sys/dev/syscons/syscons.c Wed Mar 14 07:16:29 2018 (r330912) @@ -1651,20 +1651,68 @@ sc_cnterm(struct consdev *cp) static void sccnclose(sc_softc_t *sc, struct sc_cnstate *sp); static int sc_cngetc_locked(struct sc_cnstate *sp); +static void sccnkbdlock(sc_softc_t *sc, struct sc_cnstate *sp); +static void sccnkbdunlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnopen(sc_softc_t *sc, struct sc_cnstate *sp, int flags); static void sccnscrlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnscrunlock(sc_softc_t *sc, struct sc_cnstate *sp); static void +sccnkbdlock(sc_softc_t *sc, struct sc_cnstate *sp) +{ + /* + * Locking method: hope for the best. + * The keyboard is supposed to be Giant locked. We can't handle that + * in general. The kdb_active case here is not safe, and we will + * proceed without the lock in all cases. + */ + sp->kbd_locked = !kdb_active && mtx_trylock(&Giant); +} + +static void +sccnkbdunlock(sc_softc_t *sc, struct sc_cnstate *sp) +{ + if (sp->kbd_locked) + mtx_unlock(&Giant); + sp->kbd_locked = FALSE; +} + +static void sccnscrlock(sc_softc_t *sc, struct sc_cnstate *sp) { - SC_VIDEO_LOCK(sc); + int retries; + + /** + * Locking method: + * - if kdb_active and video_mtx is not owned by anyone, then lock + * by kdb remaining active + * - if !kdb_active, try to acquire video_mtx without blocking or + * recursing; if we get it then it works normally. + * Note that video_mtx is especially unusable if we already own it, + * since then it is protecting something and syscons is not reentrant + * enough to ignore the protection even in the kdb_active case. + */ + if (kdb_active) { + sp->kdb_locked = sc->video_mtx.mtx_lock == MTX_UNOWNED || panicstr; + sp->mtx_locked = FALSE; + } else { + sp->kdb_locked = FALSE; + for (retries = 0; retries < 1000; retries++) { + sp->mtx_locked = mtx_trylock_spin_flags(&sc->video_mtx, + MTX_QUIET) != 0 || panicstr; + if (sp->mtx_locked) + break; + DELAY(1); + } + } } static void sccnscrunlock(sc_softc_t *sc, struct sc_cnstate *sp) { - SC_VIDEO_UNLOCK(sc); + if (sp->mtx_locked) + mtx_unlock_spin(&sc->video_mtx); + sp->mtx_locked = sp->kdb_locked = FALSE; } static void @@ -1676,11 +1724,14 @@ sccnopen(sc_softc_t *sc, struct sc_cnstate *sp, int fl sp->kbd_opened = FALSE; sp->scr_opened = FALSE; + sp->kbd_locked = FALSE; /* Opening the keyboard is optional. */ if (!(flags & 1) || sc->kbd == NULL) goto over_keyboard; + sccnkbdlock(sc, sp); + /* * Make sure the keyboard is accessible even when the kbd device * driver is disabled. @@ -1698,6 +1749,8 @@ over_keyboard: ; /* The screen is opened iff locking it succeeds. */ sccnscrlock(sc, sp); + if (!sp->kdb_locked && !sp->mtx_locked) + return; sp->scr_opened = TRUE; /* The screen switch is optional. */ @@ -1728,6 +1781,7 @@ sccnclose(sc_softc_t *sc, struct sc_cnstate *sp) kbdd_disable(sc->kbd); sp->kbd_opened = FALSE; + sccnkbdunlock(sc, sp); } /* @@ -1753,6 +1807,7 @@ sc_cngrab(struct consdev *cp) if (lev >= 0 && lev < 2) { sccnopen(sc, &sc->grab_state[lev], 1 | 2); sccnscrunlock(sc, &sc->grab_state[lev]); + sccnkbdunlock(sc, &sc->grab_state[lev]); } } @@ -1765,12 +1820,17 @@ sc_cnungrab(struct consdev *cp) sc = sc_console->sc; lev = atomic_load_acq_int(&sc->grab_level) - 1; if (lev >= 0 && lev < 2) { + sccnkbdlock(sc, &sc->grab_state[lev]); sccnscrlock(sc, &sc->grab_state[lev]); sccnclose(sc, &sc->grab_state[lev]); } atomic_add_int(&sc->grab_level, -1); } +static char sc_cnputc_log[0x1000]; +static u_int sc_cnputc_loghead; +static u_int sc_cnputc_logtail; + static void sc_cnputc(struct consdev *cd, int c) { @@ -1782,12 +1842,28 @@ sc_cnputc(struct consdev *cd, int c) struct tty *tp; #endif #endif /* !SC_NO_HISTORY */ + u_int head; int s; /* assert(sc_console != NULL) */ sccnopen(scp->sc, &st, 0); + /* + * Log the output. + * + * In the unlocked case, the logging is intentionally only + * perfectly atomic for the indexes. + */ + head = atomic_fetchadd_int(&sc_cnputc_loghead, 1); + sc_cnputc_log[head % sizeof(sc_cnputc_log)] = c; + + /* + * If we couldn't open, return to defer output. + */ + if (!st.scr_opened) + return; + #ifndef SC_NO_HISTORY if (scp == scp->sc->cur_scp && scp->status & SLKED) { scp->status &= ~SLKED; @@ -1815,8 +1891,14 @@ sc_cnputc(struct consdev *cd, int c) } #endif /* !SC_NO_HISTORY */ - buf[0] = c; - sc_puts(scp, buf, 1, 1); + /* Play any output still in the log (our char may already be done). */ + while (sc_cnputc_logtail != atomic_load_acq_int(&sc_cnputc_loghead)) { + buf[0] = sc_cnputc_log[sc_cnputc_logtail++ % sizeof(sc_cnputc_log)]; + if (atomic_load_acq_int(&sc_cnputc_loghead) - sc_cnputc_logtail >= + sizeof(sc_cnputc_log)) + continue; + sc_puts(scp, buf, 1, 1); + } s = spltty(); /* block sckbdevent and scrn_timer */ sccnupdate(scp); @@ -1827,16 +1909,20 @@ sc_cnputc(struct consdev *cd, int c) static int sc_cngetc(struct consdev *cd) { + struct sc_cnstate st; int c, s; /* assert(sc_console != NULL) */ + sccnopen(sc_console->sc, &st, 1); s = spltty(); /* block sckbdevent and scrn_timer while we poll */ - if (sc_console->sc->kbd == NULL) { + if (!st.kbd_opened) { splx(s); - return -1; + sccnclose(sc_console->sc, &st); + return -1; /* means no keyboard since we fudged the locking */ } - c = sc_cngetc_locked(NULL); + c = sc_cngetc_locked(&st); splx(s); + sccnclose(sc_console->sc, &st); return c; } @@ -1853,14 +1939,16 @@ sc_cngetc_locked(struct sc_cnstate *sp) * Stop the screen saver and update the screen if necessary. * What if we have been running in the screen saver code... XXX */ - sc_touch_scrn_saver(); + if (sp->scr_opened) + sc_touch_scrn_saver(); scp = sc_console->sc->cur_scp; /* XXX */ - sccnupdate(scp); + if (sp->scr_opened) + sccnupdate(scp); if (fkeycp < fkey.len) return fkey.str[fkeycp++]; - c = scgetc(scp->sc, SCGETC_CN | SCGETC_NONBLOCK, NULL); + c = scgetc(scp->sc, SCGETC_CN | SCGETC_NONBLOCK, sp); switch (KEYFLAGS(c)) { case 0: /* normal char */ @@ -3466,7 +3554,11 @@ next_code: scp = sc->cur_scp; /* first see if there is something in the keyboard port */ for (;;) { + if (flags & SCGETC_CN) + sccnscrunlock(sc, sp); c = kbdd_read_char(sc->kbd, !(flags & SCGETC_NONBLOCK)); + if (flags & SCGETC_CN) + sccnscrlock(sc, sp); if (c == ERRKEY) { if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); Modified: stable/11/sys/dev/syscons/syscons.h ============================================================================== --- stable/11/sys/dev/syscons/syscons.h Wed Mar 14 07:11:33 2018 (r330911) +++ stable/11/sys/dev/syscons/syscons.h Wed Mar 14 07:16:29 2018 (r330912) @@ -192,6 +192,9 @@ struct scr_stat; struct tty; struct sc_cnstate { + u_char kbd_locked; + u_char kdb_locked; + u_char mtx_locked; u_char kbd_opened; u_char scr_opened; };