Date: Wed, 23 Apr 2014 23:01:36 +0300
From: Mikolaj Golub <trociny@FreeBSD.org>
To: freebsd-hackers@freebsd.org
Cc: Stanislav Sedov <stas@FreeBSD.org>
Subject: valgrind on amd64 crashes when delivering signal for threaded application
Message-ID: <20140423200135.GA6009@gmail.com>

I am observing an issue with valgrind on amd64 (CURRENT and 10): it
crashes the application when delivering an asynchronous signal if the
application is linked with libthr.

This simple test illustrates the issue:
  #include <sys/param.h>
  
  #include <signal.h>
  #include <unistd.h>
  
  static void
  dummy_sighandler(int sig)
  {
  	/* EMPTY */
  }
  
  int
  main()
  {
  	int c = 10;
  
  	if (signal(SIGINT, dummy_sighandler) == SIG_ERR)
  		return (1);
  	sleep(100);
  	return (0);
  }
  
Building it with -lpthread, running it under valgrind, and pressing
Ctrl-C makes it crash:
  kopusha:~/freebsd/valgrind/test_sa% valgrind --trace-signals=yes ./test_sa 
  ==55627== Memcheck, a memory error detector
  ==55627== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
  ==55627== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
  ==55627== Command: ./test_sa
  ==55627== 
  --55627-- Max kernel-supported signal is 128
  --55627-- sync signal handler: signal=11, si_code=1, EIP=0x23822, eip=0x4012e99a8, from kernel
  --55627-- SIGSEGV: si_code=1 faultaddr=0x7feffef08 tid=1 ESP=0x7feffef08 seg=0x7fe001000-0x7feffefff
  --55627--        -> extended stack base to 0x7feffe000
  --55627-- do_setmask: tid = 1 how = 1 (SIG_BLOCK), newset = 0x22C4F8 (fffffffffffffffffffffffffffff107)
  --55627--       oldset=0x7FEFFFC60 00000000000000000000000000000000
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x22C50C (00000000000000000000000000000000)
  --55627-- do_setmask: tid = 1 how = 1 (SIG_BLOCK), newset = 0x22C4F8 (fffffffffffffffffffffffffffff107)
  --55627--       oldset=0x7FEFFF7F0 00000000000000000000000000000000
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x22C50C (00000000000000000000000000000000)
  --55627-- do_setmask: tid = 1 how = 1 (SIG_BLOCK), newset = 0x22C4F8 (fffffffffffffffffffffffffffff107)
  --55627--       oldset=0x7FEFFF7F0 00000000000000000000000000000000
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x22C50C (00000000000000000000000000000000)
  --55627-- do_setmask: tid = 1 how = 1 (SIG_BLOCK), newset = 0x22C4F8 (fffffffffffffffffffffffffffff107)
  --55627--       oldset=0x7FEFFF7F0 00000000000000000000000000000000
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x22C50C (00000000000000000000000000000000)
  --55627-- sys_sigaction: sigNo 32, new 0x7fefff7b8, old 0x0, new flags 0x40
  --55627-- do_setmask: tid = 1 how = 2 (SIG_UNBLOCK), newset = 0x7FEFFF7C4 (00000000000000000000000080000000)
  --55627-- do_setmask: tid = 1 how = 1 (SIG_BLOCK), newset = 0x1220D18 (ffffffffffffffffffffffffffffffff)
  --55627--       oldset=0x1C00128 00000000000000000000000000000000
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x1C00128 (00000000000000000000000000000000)
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x1220D18 (ffffffffffffffffffffffffffffffff)
  --55627--       oldset=0x7FF0005B0 00000000000000000000000000000000
  --55627-- sys_sigaction: sigNo 2, new 0x7ff000600, old 0x7ff0005e0, new flags 0x42
  --55627-- do_setmask: tid = 1 how = 3 (SIG_SETMASK), newset = 0x7FF0005B0 (00000000000000000000000000000000)
  ^C--55627-- async signal handler: signal=2, tid=1, si_code=65542
  --55627-- interrupted_syscall: tid=1, ip=0x380ca816, restart=True, sres.isErr=True, sres.val=4
  --55627--   completed, but uncommitted: committing
  --55627-- delivering signal 2 (SIGINT):65542 to thread 1
  --55627-- push_signal_frame (thread 1): signal 2
  ==55627==    at 0x1541A4A: nanosleep (nanosleep.S:3)
  ==55627==    by 0x1492B29: sleep (sleep.c:58)
  ==55627==    by 0x1217C12: sleep (thr_syscalls.c:614)
  ==55627==    by 0x4007D7: main (test_sa.c:19)
  --55627-- sys_sigaction: sigNo 11, new 0x4012c3e78, old 0x0, new flags 0x0
  --55627-- delivering signal 11 (SIGSEGV):128 to thread 1
  --55627-- delivering 11 (code 128) to default handler; action: terminate+core
  ==55627== 
  ==55627== Process terminating with default action of signal 11 (SIGSEGV): dumping core
  ==55627==  General Protection Fault
  ==55627==    at 0x1219F3C: ??? (thr_sig.c:162)
  ==55627==    by 0x380529C7: ??? (m_trampoline.S:713)
  ==55627==    by 0x1217C12: sleep (thr_syscalls.c:614)
  ==55627==    by 0x4007D7: main (test_sa.c:19)
  ==55627== 
  ==55627== HEAP SUMMARY:
  ==55627==     in use at exit: 1,080 bytes in 2 blocks
  ==55627==   total heap usage: 2 allocs, 0 frees, 1,080 bytes allocated
  ==55627== 
  ==55627== LEAK SUMMARY:
  ==55627==    definitely lost: 0 bytes in 0 blocks
  ==55627==    indirectly lost: 0 bytes in 0 blocks
  ==55627==      possibly lost: 0 bytes in 0 blocks
  ==55627==    still reachable: 1,080 bytes in 2 blocks
  ==55627==         suppressed: 0 bytes in 0 blocks
  ==55627== Rerun with --leak-check=full to see details of leaked memory
  ==55627== 
  ==55627== For counts of detected and suppressed errors, rerun with: -v
  ==55627== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
  zsh: segmentation fault  valgrind --trace-signals=yes ./test_sa
I tracked it to r249423 (import of clang 3.3), which optimizes 
this statement in the signal handler wrapper from thr_sig.c:
  static void
  thr_sighandler(int sig, siginfo_t *info, void *_ucp)
  {
  	...
  	struct sigaction act;
  	...
  	act = _thr_sigact[sig-1].sigact;
into a sequence of movups/movaps instructions:
   0x000000000000dc2f <+79>:    movups (%r14,%r15,1),%xmm0
   0x000000000000dc34 <+84>:    movups 0x10(%r14,%r15,1),%xmm1
   0x000000000000dc3a <+90>:    movaps %xmm1,-0x40(%rbp)
   0x000000000000dc3e <+94>:    movaps %xmm0,-0x50(%rbp)
I got lost in the details of valgrind's signal handling, but apparently
the frame for thr_sighandler() is misaligned when running under valgrind,
and as a result the movaps operand (the destination, the local variable
act) is not aligned on a 16-byte boundary.
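
The alignment arithmetic can be sketched as follows. This is only an
illustration under two assumptions that are not confirmed above: that the
handler prologue is the usual "push %rbp; mov %rsp,%rbp", and that the
handler is entered with %rsp pointing at the start of the pushed frame.
The helper names are made up for the example. The amd64 ABI expects
%rsp + 8 to be a multiple of 16 at function entry, which makes
-0x50(%rbp) 16-byte aligned; a 16-byte-aligned %rsp at entry shifts it
by 8.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Address of the 'act' local, given the stack pointer on entry to the
 * handler.  Assumes a "push %rbp; mov %rsp,%rbp" prologue and that 'act'
 * lives at -0x50(%rbp), as in the disassembly above.
 */
static uint64_t
act_address(uint64_t rsp_at_entry)
{
	uint64_t rbp = rsp_at_entry - 8;	/* effect of push %rbp */

	return (rbp - 0x50);
}

/* movaps faults unless its memory operand is 16-byte aligned. */
static bool
movaps_ok(uint64_t addr)
{
	return ((addr & 0xf) == 0);
}
```

With an entry %rsp ending in 8 (the ABI-conformant case) movaps_ok()
holds for act_address(); with an entry %rsp that is itself a multiple of
16 it does not, which would match the General Protection Fault in the
trace.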
The problem can be worked around either by compiling thr_sig.c without
optimization or by replacing the assignment with bcopy().
Also, changing the alignment of the sigframe that valgrind pushes on
the stack when delivering a signal, so that it is offset by 8 bytes
from a 16-byte boundary, fixes the issue:
  --- coregrind/m_sigframe/sigframe-amd64-freebsd.c.orig  2014-04-23 22:39:45.000000000 +0300
  +++ coregrind/m_sigframe/sigframe-amd64-freebsd.c       2014-04-23 22:40:23.000000000 +0300
  @@ -250,7 +250,7 @@ static Addr build_sigframe(ThreadState *
      UWord err;
   
      rsp -= sizeof(*frame);
  -   rsp = VG_ROUNDDN(rsp, 16);
  +   rsp = VG_ROUNDDN(rsp, 16) - 8;
      frame = (struct sigframe *)rsp;
   
      if (!extend(tst, rsp, sizeof(*frame)))
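
The effect of the one-line change on the frame address can be sketched
like this. ROUNDDN is a local stand-in written for this example (round
down to a power-of-two multiple), and the frame size passed in is
arbitrary, since sizeof(struct sigframe) is not given here.

```c
#include <stdint.h>

/* Local stand-in: round p down to a multiple of a (a is a power of two). */
#define	ROUNDDN(p, a)	((uint64_t)(p) & ~((uint64_t)(a) - 1))

/* Frame address as computed before the patch. */
static uint64_t
frame_rsp_original(uint64_t rsp, uint64_t frame_size)
{
	rsp -= frame_size;
	return (ROUNDDN(rsp, 16));
}

/* Frame address as computed with the patch applied. */
static uint64_t
frame_rsp_patched(uint64_t rsp, uint64_t frame_size)
{
	rsp -= frame_size;
	return (ROUNDDN(rsp, 16) - 8);
}
```

The original always yields an address that is a multiple of 16; the
patched one always yields an address that is 8 modulo 16, which is what
a function normally sees in %rsp right after a call instruction.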
  
Unfortunately, I have a poor understanding of valgrind internals and of
what exactly happens when it delivers a signal to the process, so I
failed to find a proper fix.
-- 
Mikolaj Golub
