Date: Wed, 12 Apr 2023 16:33:22 -0500 From: Kyle Evans <kevans@freebsd.org> To: =?UTF-8?B?SmVhbi1Tw6liYXN0aWVuIFDDqWRyb24=?= <dumbbell@freebsd.org> Cc: freebsd-hackers@freebsd.org Subject: Re: Handling panics inside vt(4) callbacks Message-ID: <CACNAnaH_dv8ZDJV2PH6tr8UCAHi=C=Mdw=qYCgMqwoW-rwQjTQ@mail.gmail.com> In-Reply-To: <4ed85151-09e8-db3e-0e0b-d0a8f3bb937c@FreeBSD.org> References: <4ed85151-09e8-db3e-0e0b-d0a8f3bb937c@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Apr 12, 2023 at 3:45=E2=80=AFPM Jean-S=C3=A9bastien P=C3=A9dron <dumbbell@freebsd.org> wrote: > > Hi! > > While working on the DRM drivers, I don't always get a kernel core dump > in case of a panic. > > My hypothesis is that if the DRM driver code called by vt(4) panics, > then the panic code might not go through successfully. The reason is > because panic(9) prints the reason, a stacktrace and possibly some > progress to the console, which calls vt(4) and the DRM driver code again. > > I played with the following patch: > https://gist.github.com/dumbbell/88d77789bfeb38869268c84c40575f49 > > The idea is that before calling "vt_flush()" in "vtterm_done()", I set a > global flag to true to indicate that vt(4) is called as part of kdb or a > panic. If another panic occurs inside vt_flush(), typically the > underlying DRM driver code, "vtterm_done()" is called recursively and > "vt_flush()" might trigger the same panic again. If the flag is set, the > entire function is skipped instead. > > I test the patch by adding a panic(9) just before "vt_flush()" and I > trigger the initial panic with debug.kdb.panic=3D1. I don't even load a > DRM driver. My problem is that in this case, the laptop reboots > immediately. However, if I replace panic(9) with a simple printf(9), it > works as expected and I get a kernel dump. > > I could not find something in panic(9) code that would reboot the > computer in case of a nested panic. > > Previous versions of the patch called doadump() and rebooted the > computer explicitly if the flag was set, but it didn't work either and I > thought I could simplify that patch and let panic(9) handle recursion. > In other words, I just want to skip most of vt(4) code if vt(4) or DRM > crash. > > Does someone spot something wrong in my hypothesis or methodology? > FWIW, I have a related patch that I maintain in my tree that I simply haven't found time to try and upstream. When the system panics, it tries to switch back to ttyv0, but it calls into vd_postswitch() with the vt lock still held. In my case, with i915kms, the vd_postswitch implementation attempts to sleep with the lock still held and everything goes off the rails. See below. Thanks, Kyle Evans diff --git a/sys/dev/vt/vt_core.c b/sys/dev/vt/vt_core.c index 267dd7bf2489..b57370beae7b 100644 --- a/sys/dev/vt/vt_core.c +++ b/sys/dev/vt/vt_core.c @@ -592,10 +592,10 @@ vt_window_switch(struct vt_window *vw) * switch to console mode when panicking, making sure the p= anic * is readable (even when a GUI was using ttyv0). */ + VT_UNLOCK(vd); if ((kdb_active || KERNEL_PANICKED()) && vd->vd_driver->vd_postswitch) vd->vd_driver->vd_postswitch(vd); - VT_UNLOCK(vd); return (0); } if (!(vw->vw_flags & (VWF_OPENED|VWF_CONSOLE))) {
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACNAnaH_dv8ZDJV2PH6tr8UCAHi=C=Mdw=qYCgMqwoW-rwQjTQ>