Date: Sun, 31 Mar 2019 15:10:15 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Bruce Evans <brde@optusnet.com.au> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r345696 - head/lib/libvgl Message-ID: <20190331121015.GK1923@kib.kiev.ua> In-Reply-To: <20190331214235.K961@besplex.bde.org> References: <201903291557.x2TFv9AW097226@repo.freebsd.org> <20190329182100.GZ1923@kib.kiev.ua> <20190330142319.I1011@besplex.bde.org> <20190330094558.GA1923@kib.kiev.ua> <20190331214235.K961@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, Mar 31, 2019 at 10:27:54PM +1100, Bruce Evans wrote: > On Sat, 30 Mar 2019, Konstantin Belousov wrote: > > > On Sat, Mar 30, 2019 at 03:24:40PM +1100, Bruce Evans wrote: > >> On Fri, 29 Mar 2019, Konstantin Belousov wrote: > >> > >>> On Fri, Mar 29, 2019 at 03:57:09PM +0000, Bruce Evans wrote: > >>>> Author: bde > >>>> Date: Fri Mar 29 15:57:08 2019 > >>>> New Revision: 345696 > >>>> URL: https://svnweb.freebsd.org/changeset/base/345696 > >>>> > >>>> Log: > >>>> Fix endless loops for handling SIGBUS and SIGSEGV. > >>>> > >>>> r80270 has the usual wrong fix for unsafe signal handling -- just set > >>>> a flag and return to let an event loop check the flag and do safe > >>>> handling. This never works for signals like SIGBUS and SIGSEGV that > >>>> repeat and works poorly for others unless the application has an event > >>>> loop designed to support this. > >>>> > >>>> For these signals, clean up unsafely as before, except for arranging that > >>>> nested signals are fatal and forcing a nested signal if the cleanup doesn't > >>>> cause one. > >>>> > >>>> Modified: > >>>> head/lib/libvgl/main.c > >>>> > >>>> Modified: head/lib/libvgl/main.c > >>>> ============================================================================== > >>>> --- head/lib/libvgl/main.c Fri Mar 29 15:20:48 2019 (r345695) > >>>> +++ head/lib/libvgl/main.c Fri Mar 29 15:57:08 2019 (r345696) > >>>> ... > >>>> @@ -107,14 +107,22 @@ struct vt_mode smode; > >>>> } > >>>> > >>>> static void > >>>> -VGLAbort(int arg __unused) > >>>> +VGLAbort(int arg) > >>>> { > >>>> + sigset_t mask; > >>>> + > >>>> VGLAbortPending = 1; > >>>> signal(SIGINT, SIG_IGN); > >>>> signal(SIGTERM, SIG_IGN); > >>>> - signal(SIGSEGV, SIG_IGN); > >>>> - signal(SIGBUS, SIG_IGN); > >>>> signal(SIGUSR2, SIG_IGN); > >>>> + if (arg == SIGBUS || arg == SIGSEGV) { > >>>> + signal(arg, SIG_DFL); > >>>> + sigemptyset(&mask); > >>>> + sigaddset(&mask, arg); > >>>> + sigprocmask(SIG_UNBLOCK, &mask, NULL); > >>>> + VGLEnd(); > >>>> + kill(getpid(), arg); > >>> This of course misses the siginfo information from the real fault. > >> > >> It is in the nested signal frame. > >> > >>> Why SIGBUS/SIGSEGV are caught at all ? > >> > >> Otherwise, the screen is left in an application mode that the kernel > >> doesn't support (except to not write to the screen, so that the > >> application has full control). > > Also, the keyboard may be left in a strange state. Usually this is no > worse than termios raw mode, but it may be raw scancodes. In raw scancodes > mode, Alt-Fn to switch to another vty to fix the problem. libvgl disables > switching to another vty without going through a libvgl handler anyway. > The combination is usually enough to break switching to vty0 to run ddb, > though that should work for at least breakpoints in the kernel. IIRC, > sc_cngrab() switches the keyboard mode, and my version of it restores > ignoring of the anti-switch flag. > > > ... > > I am more about not doing kill(2) from the handler, instead do plain > > return to the context which caused the original signal. > > That won't work well for SIGBUS's and SIGSEGV's related to the library. > If the above cleanup is not done, then it is too hard to debug the problem, > and if the above cleanup is done then it is impossible to debug the original > problem if it was for and invalid access in libvgl code. > > The latter is a problem for the above cleanup too -- the cleanup clobbers > the libvgl state. > > However, I could do a more limited cleanup of just restoring the screen and > keyboard state using ioctls. ioctl() isn't async-signal safe in POSIX, but > most ioctls are async-signal safe in practice provided they don't use > global variables that were set too recently or too non-atomically. > VGLEnd() uses main globals established by VGLInit(). It depends on program > order which is not guaranteed in signal handlers for testing the flags set > by initialization. The cleanup now also does: > - screen clearing. This breaks debugging and is dangerous if the bug is > in screen clearing. Screen clearing is also very slow, and belongs > in the kernel. It is now done 4 times per libvgl use, but its > slowness is limited by the kernel not clearing properly and the > kernel not allowing the application to mmap() the whole physical > frame buffer in some versions of FreeBSD (this breaks panning). > - munmap(). This breaks debugging. > - free() of backing store buffer and the main VGLDisplay data. This breaks > debugging and is unnecessary if we are going to exit. You can only try to do async-safe calls from SIGSEGV handler, to not break a limited chances to success. Also you should not modify in-memory state for the library, for the same reason, and then it fits your goal of keeping the state intact for debugging. Then you should return to re-create the original situation causing and not kill(2). I find it weird to try to debug mode-switched code without serial console.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190331121015.GK1923>