Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 12 Apr 2023 16:33:22 -0500
From:      Kyle Evans <kevans@freebsd.org>
To:        =?UTF-8?B?SmVhbi1Tw6liYXN0aWVuIFDDqWRyb24=?= <dumbbell@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Handling panics inside vt(4) callbacks
Message-ID:  <CACNAnaH_dv8ZDJV2PH6tr8UCAHi=C=Mdw=qYCgMqwoW-rwQjTQ@mail.gmail.com>
In-Reply-To: <4ed85151-09e8-db3e-0e0b-d0a8f3bb937c@FreeBSD.org>
References:  <4ed85151-09e8-db3e-0e0b-d0a8f3bb937c@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Apr 12, 2023 at 3:45=E2=80=AFPM Jean-S=C3=A9bastien P=C3=A9dron
<dumbbell@freebsd.org> wrote:
>
> Hi!
>
> While working on the DRM drivers, I don't always get a kernel core dump
> in case of a panic.
>
> My hypothesis is that if the DRM driver code called by vt(4) panics,
> then the panic code might not go through successfully. The reason is
> because panic(9) prints the reason, a stacktrace and possibly some
> progress to the console, which calls vt(4) and the DRM driver code again.
>
> I played with the following patch:
> https://gist.github.com/dumbbell/88d77789bfeb38869268c84c40575f49
>
> The idea is that before calling "vt_flush()" in "vtterm_done()", I set a
> global flag to true to indicate that vt(4) is called as part of kdb or a
> panic. If another panic occurs inside vt_flush(), typically the
> underlying DRM driver code, "vtterm_done()" is called recursively and
> "vt_flush()" might trigger the same panic again. If the flag is set, the
> entire function is skipped instead.
>
> I test the patch by adding a panic(9) just before "vt_flush()" and I
> trigger the initial panic with debug.kdb.panic=3D1. I don't even load a
> DRM driver. My problem is that in this case, the laptop reboots
> immediately. However, if I replace panic(9) with a simple printf(9), it
> works as expected and I get a kernel dump.
>
> I could not find something in panic(9) code that would reboot the
> computer in case of a nested panic.
>
> Previous versions of the patch called doadump() and rebooted the
> computer explicitly if the flag was set, but it didn't work either and I
> thought I could simplify that patch and let panic(9) handle recursion.
> In other words, I just want to skip most of vt(4) code if vt(4) or DRM
> crash.
>
> Does someone spot something wrong in my hypothesis or methodology?
>

FWIW, I have a related patch that I maintain in my tree that I simply
haven't found time to try and upstream.  When the system panics, it
tries to switch back to ttyv0, but it calls into vd_postswitch() with
the vt lock still held. In my case, with i915kms, the vd_postswitch
implementation attempts to sleep with the lock still held and
everything goes off the rails. See below.

Thanks,

Kyle Evans

diff --git a/sys/dev/vt/vt_core.c b/sys/dev/vt/vt_core.c
index 267dd7bf2489..b57370beae7b 100644
--- a/sys/dev/vt/vt_core.c
+++ b/sys/dev/vt/vt_core.c
@@ -592,10 +592,10 @@ vt_window_switch(struct vt_window *vw)
                 * switch to console mode when panicking, making sure the p=
anic
                 * is readable (even when a GUI was using ttyv0).
                 */
+               VT_UNLOCK(vd);
                if ((kdb_active || KERNEL_PANICKED()) &&
                    vd->vd_driver->vd_postswitch)
                        vd->vd_driver->vd_postswitch(vd);
-               VT_UNLOCK(vd);
                return (0);
        }
        if (!(vw->vw_flags & (VWF_OPENED|VWF_CONSOLE))) {



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACNAnaH_dv8ZDJV2PH6tr8UCAHi=C=Mdw=qYCgMqwoW-rwQjTQ>