Date: Wed, 23 Apr 2014 23:28:25 -0400 (EDT) From: Benjamin Kaduk <kaduk@MIT.EDU> To: Mikolaj Golub <trociny@freebsd.org> Cc: Stanislav Sedov <stas@freebsd.org>, freebsd-hackers@freebsd.org Subject: Re: valgrind on amd64 crashes when delivering signal for threaded application Message-ID: <alpine.GSO.1.10.1404232316020.21026@multics.mit.edu> In-Reply-To: <20140423200135.GA6009@gmail.com> References: <20140423200135.GA6009@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 23 Apr 2014, Mikolaj Golub wrote: > I am observing an issue with valgrind on amd64 CURRENT or 10, when it I cannot remember whether we changed the stack alignment on one or both of i386 and amd64 when we switched to clang; I think we did, but am having trouble finding it in the archives. Though, I think it would have been to match what clang does by default on linux, which would not really help explain the weird behavior from valgrind. > I tracked it to r249423 (import of clang 3.3), which optimizes > this statement in the signal handler wrapper from thr_sig.c: > > static void > thr_sighandler(int sig, siginfo_t *info, void *_ucp) > { > ... > struct sigaction act; > ... > act = _thr_sigact[sig-1].sigact; > > into a sequence of movups/movaps instructions: > > 0x000000000000dc2f <+79>: movups (%r14,%r15,1),%xmm0 > 0x000000000000dc34 <+84>: movups 0x10(%r14,%r15,1),%xmm1 > 0x000000000000dc3a <+90>: movaps %xmm1,-0x40(%rbp) > 0x000000000000dc3e <+94>: movaps %xmm0,-0x50(%rbp) > > I have lost in valgrind signal handling details, but apparently the > frame for thr_sighandler() is misaligned when running by valgrind and > as a result the movaps operand (the destination of act local variable) > is not aligned on a 16-byte boundary. > > The prblem may be workarounded either by compiling thr_sig.c without > optimization or replacing the assignment by bcopy(). > > Also, changing the alignment of the sigframe the valgrind pushes on > the stack when delivering a signal to 8 bytes fixes the issue: > > --- coregrind/m_sigframe/sigframe-amd64-freebsd.c.orig 2014-04-23 22:39:45.000000000 +0300 > +++ coregrind/m_sigframe/sigframe-amd64-freebsd.c 2014-04-23 22:40:23.000000000 +0300 > @@ -250,7 +250,7 @@ static Addr build_sigframe(ThreadState * > UWord err; > > rsp -= sizeof(*frame); > - rsp = VG_ROUNDDN(rsp, 16); > + rsp = VG_ROUNDDN(rsp, 16) - 8; I would expect that the fact that this patch fixes the observed crash means that valgrind has a bug when setting up the stack for the signal handler. I had to work around an apparently similar bug in the built-in lightweight thread implementation in net/openafs by forcing -mstack-realign to be used for its compilation (because analyzing the lightweight threads implementation when upstream is trying to switch to pthreads is not worth the effort). I guess here the thing to try would be compiling libthr with -mstack-realign, not that that is a reasonable thing to do in head. Perhaps the valgrind upstream should be asked about the details of the stack creation? -Ben > frame = (struct sigframe *)rsp; > > if (!extend(tst, rsp, sizeof(*frame))) > > Unfortunately, I have poor understanding of valgrind internals and > what is going on exactly when it delivers a signal to the process, so > failed to find a proper fix.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.GSO.1.10.1404232316020.21026>