Date: Sun, 4 Mar 2012 00:59:59 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: svn-src-head@FreeBSD.org, Tijl Coosemans <tijl@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, Bruce Evans <brde@optusnet.com.au>
Subject: Re: svn commit: r232275 - in head/sys: amd64/include i386/include pc98/include x86/include
Message-ID: <20120303221614.G5236@besplex.bde.org>
In-Reply-To: <20120303091426.GS75778@deviant.kiev.zoral.com.ua>
References: <201202282217.q1SMHrIk094780@svn.freebsd.org> <201203012347.32984.tijl@freebsd.org> <20120302132403.P929@besplex.bde.org> <201203022231.43186.tijl@freebsd.org> <20120303110551.Q1494@besplex.bde.org> <20120303091426.GS75778@deviant.kiev.zoral.com.ua>
On Sat, 3 Mar 2012, Konstantin Belousov wrote:

> On Sat, Mar 03, 2012 at 12:02:23PM +1100, Bruce Evans wrote:
>> On Fri, 2 Mar 2012, Tijl Coosemans wrote:
>>
>> So the interesting points for signal handlers move to:
>> - should signal handlers have to initialize their own state if they want
>>   to use FP explicitly?  I think they should.
> Might be, they should if talking about abstract C implementation,
> but any useful Unix (U*x, probably) implementation gives much more
> guarantees there. They don't document it of course.
>> - should signal handlers have to initialize their own state if they want
>>   to use FP or shared registers implicitly (because the compiler wants to)?
>>   No.  The kernel must handle this transparently, much like it does now,
>>   and I think this makes the previous case work transparently too.  The
>>   kernel tries to do this lazily, but it doesn't do this very well (it
>>   copies the state several times in sendsig() and sigreturn()).
>> - when the signal handler wants to modify the interrupted state, how does
>>   it do this?  There is minimal support for this.  The easiest way to
>>   modify it is to modify the current state and then longjmp() instead of
>>   returning.
> I disagree. The most correct way is to modify ucontext_t supplied to the
> handler, and then return normally. There may be state grown in next
> generations of architecture which signal handler author is not aware.
> Also, on some architectures some parts of the ucontext/sigcontext
> can only be restored by kernel. This is true even for x86.

So you want an average SIGINT handler that doesn't want to do any FP,
to understand the complications of FP better than longjmp() does, so as
to do what longjmp() doesn't know how to do, for future arches.

It is true that some parts of contexts can only be restored by the
kernel.  Even the signal mask requires sigreturn(2) for restoral without
races.
FreeBSD's [sig]longjmp() doesn't know how to do this, and neither does
an average user of [sig]longjmp().  Returning from signal handlers works
better because it uses sigreturn() automatically.  However, it is
difficult to _modify_ delicate FP or other state that you don't
understand in a signal handler so that sigreturn() restores what you
want.  It is easiest to prepare the state before calling setjmp() and
have longjmp() simply restore it.  The normal preparation is to do
nothing -- the program knows nothing of FP, and is happily running with
a usable FP state.  setjmp() simply saves this state, and longjmp()
should restore it (except for exception flags).

Note that it doesn't work to require the program to fix up the state
after setjmp() returns 1, since the program knows nothing of FP so it
won't know how to fix up the state then any more than it knows how to
fix up the state before longjmp(), although the fixup is much easier.
Apart from no program knowing that it should be done, something like
the following would work:

    for _every_ call to setjmp:

    /* Save FP env, because some setjmp()s are too broken to do it. */
    fegetenv(&env);
    if (setjmp(jb) != 0) {
        /*
         * Restore FP env, since some longjmp()s are too broken...
         *
         * But first, if we are actually an FP program that wants
         * to use fenv, then try to recover the current exception
         * flags.  Most longjmp()s from signal handlers lose these,
         * but this is harder to fix so we just hope that we don't
         * have to.
         */
        fegetexceptflag(&ex, FE_ALL_EXCEPT);
        fesetenv(&env);
        fesetexceptflag(&ex, FE_ALL_EXCEPT);
        /*
         * XXX what about raising any exceptions that we just
         * unmasked?
         *
         * In a signal handler (before longjmp()) the code to
         * not lose the exception flags (assuming that the signal
         * handler is passed a clean state) would be something like:
         * - use fenv to mask all exceptions
         * - read the exception flags from ucontext_t.  The MI
         *   API fegetexceptflag() is usually unavailable for this
         * - store the exception flags into the hardware.  Since
         *   the MI API fesetexceptflag() is not available either
         *   this seems easier than converting the hardware
         *   representation that is probably in ucontext_t into an
         *   fexcept_t.  We masked all exceptions so that new ones
         *   don't bite us.
         * - we can now use longjmp(), provided longjmp() doesn't
         *   change any FP state and the caller of setjmp() has
         *   the above complications to replace the rest of the
         *   signal handler's unknown FP state with a good one,
         *   including unmasking any exceptions that we masked.
         */
    }

The above complications belong in setjmp() and longjmp(), not in every
program.  setjmp() and longjmp() can do them much more efficiently.  For
example, on amd64 it is not necessary to save the full FP environment
(which is a very slow operation).

>> - how can signal handlers and debuggers even see the interrupted state?
>>   gdb has less clue about this than it did 20 years ago.  Users can
>>   probably use debuggers to follow various pointers to the saved state
>>   if they know more about this than signal handlers and debuggers.
> Signal handlers should examine ucontext_t.
>
> ptrace(2) interface on FreeBSD allows to fully examine and modify the
> thread CPU state. gdb indeed was not upgraded to be aware of recent
> FreeBSD features (and not very recent features, too).

Yes, it is difficult.  I didn't even mention portability before.  For
standard C, there is no ucontext_t.  For POSIX, ucontext_t is
essentially opaque.  You can save and restore it but you can't modify it
without doing unportable things.  But FP changes require doing very
unportable OS- and CPU-dependent things.  Depending on longjmp() to work
right is of course very OS-dependent, but longjmp() can very easily
handle some CPU-dependent things provided it has non-broken semantics.
>>> If longjmp is not supposed to change the FP env then, when called from
>>> a signal handler, either the signal handler must install a proper FP
>>> env before calling longjmp or a proper FP env must be installed after
>>> the target setjmp call. Otherwise the FP env is unspecified.
>>
>> Better handle the usual case right like it used to be, without the
>> signal handler having to do anything, by always saving a minimal
>> environment in setjmp(), but now only restoring it for longjmp() in
>> signal handlers.  The minimal environment doesn't include any normal
>> register on at least amd64 and i386 (except for i387 it includes the
>> stack and the tags -- these must be empty on return from any function
>> call).
>>
>> Again there is a problem with transparent use of FP or SSE by the
>> compiler.  An average SIGINT handler that doesn't want to do any
>> explicit FP and just wants to longjmp() back to the main loop can't
>> be expected to understand this stuff better than standards, kernels
>> and compilers and have the complications necessary to fix up the FP
>> state after the compiler has transparently (it thinks) used FP or SSE.
>
> longjmp() from a signal handler has very high chance of providing
> wrong CPU state for anything except basic integer registers.

Only if longjmp() is broken.  A slightly different way to look at this
is that without fenv support, restoring the entire FP state (or all of
it that matters) to that of the setjmp() works perfectly, because
conforming programs just can't see any fenv state.  FP exception flags
correspond to the overflow flag in integer arithmetic, and average
programs know nothing of either.  Support for fenv must not be allowed
to break this.

Here is my old program for testing that some of this works on i386.  It
has rotted a bit (last edit 25 Oct 1994).  It assumes that the division
by 0 exception and the invalid operand exception are unmasked, as in
FreeBSD-[1-~2].

% #undef TEST_CW_PRESERVED_ACROSS_SIGFPE
% #define TEST_LONGJMP_RESTORES_FP
%
% #define _POSIX_SOURCE 1
%
% #include <setjmp.h>
% #include <signal.h>
% #include <unistd.h>
%
% static sigjmp_buf sjb;
%
% static void catch(int sig)
% {
%     write(1, "1", 1);
%     siglongjmp(sjb, 1);
% }
%
% int main(void)
% {
%     struct sigaction action;
%
%     action.sa_handler = catch;
%     sigemptyset(&action.sa_mask);
%     action.sa_flags = 0;
% #ifdef TEST_CW_PRESERVED_ACROSS_SIGFPE
%     sigaction(SIGFPE, &action, (struct sigaction *) NULL);
% #endif
% #ifdef TEST_LONGJMP_RESTORES_FP
%     sigaction(SIGINT, &action, (struct sigaction *) NULL);
% #endif
%
%     while (1)
%     {
%         if (sigsetjmp(sjb, 1))
%             write(1, "2", 1);
%         else
%         {
% #ifdef TEST_CW_PRESERVED_ACROSS_SIGFPE
%             __asm("fldz; fld1; fdiv %st(1),%st; fwait");
%             write(1, "?", 1);
% #endif
% #ifdef TEST_LONGJMP_RESTORES_FP
%             while (1)
%                 __asm("fld1; fstp %st");
% #endif
%         }
%     }
% }

When TEST_LONGJMP_RESTORES_FP is defined, this tests that the FP stack
doesn't become corrupted by SIGINTs (the corrupt stack should give a
SIGFPE which is not caught, else "12" should be printed after every
SIGINT).

When TEST_CW_PRESERVED_ACROSS_SIGFPE is defined, this tests that the
division by 0 exception is unmasked and remains unmasked after
longjmp() from a signal handler for the SIGFPE.

Now it also needs to check that the exception bit for division by zero
is not lost by any of
- handling an unmasked SIGFPE or a SIGINT, with and without longjmp()ing
  from the handler
- looping without handling any signal
The exception bit is lost in most cases.  Similarly for SSE and mxcsr.
Similarly for other arches.

Division by 0 is fairly easy to arrange without using asm, but the above
uses asm so that it can control the amount of FP used, and its
placement.  The asms probably now need to be volatile to prevent them
moving.  Or just compile with -O0.

I have a much larger test that checks that unmasked FP exceptions
(mainly for division by 0) work correctly on i486 and later with
exception 16, and as well as possible on i386/i387 with IRQ13.
-current still passes the former, but this is due to a bug in the test.
The test assumes that the exception flags are clobbered by SIGFPE
handling, so that when the SIGFPE handler returns normally, the SIGFPE
doesn't repeat.  i387 FP exceptions aren't quite normal faults since the
signal trap is delayed until the next non-control FP instruction after
the one that caused the exception, especially with IRQ13, but they
behave similarly unless the exception flags are clobbered (the fault
repeats on the next non-control FP instruction).  SSE FP exceptions are
normal faults (the fault repeats on the instruction that causes it, and
the exception flags have no effect on this, and FreeBSD's SIGFPE handler
doesn't clobber them anyway).  FP exceptions are rarely unmasked now, so
the buggy behaviour is mostly moot now even for i387.

Anyway, no one should expect to continue after a SIGFPE handler returns
normally without fixing up the problem completely.  That's much more
harmful than longjmp()ing from the handler (on i387, the stack is
certain to be corrupt, and the exception flags are clobbered to hide
some problems).  longjmp()ing from SIGFPE handlers needs to work as well
as longjmp() from SIGINT handlers to provide a reasonably easy way out
of them.

Bruce