Date: Mon, 30 Aug 2010 22:01:02 -0500
From: Dan McNulty <dkmcnulty@gmail.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: Dan McNulty <dkmcnulty@gmail.com>
Subject: kern/150139: [patch] signal sent to stopped, traced process not immediately handled on continue
Message-ID: <4c7c7072.9f3ae70a.772a.ffffe094@mx.google.com>
Resent-Message-ID: <201008310310.o7V3A2nu061081@freefall.freebsd.org>
>Number:         150139
>Category:       kern
>Synopsis:       [patch] signal sent to stopped, traced process not immediately handled on continue
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Aug 31 03:10:02 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Dan McNulty
>Release:        FreeBSD 7.2-RELEASE i386
>Organization:
>Environment:
FreeBSD 7.2-RELEASE
FreeBSD 9.0-CURRENT (updated on 2010-08-28)

>Description:
In a specific test case, a signal sent to a stopped, traced process is not
immediately handled when the process is continued (i.e., the debugger does
not know about the pending signal and the process is not stopped by the
signal).

Here is the scenario: a debugger is attached to a multithreaded process in
which some threads are blocked in the kernel (e.g., on a mutex).

1) The debugger installs a breakpoint into the multithreaded process and
   continues the process.
2) A single thread hits the breakpoint and the process is stopped.
3) While the process is stopped, a signal is delivered by some means to one
   of the threads that is blocked in the kernel.
4) After dealing with the breakpoint, the debugger continues the process
   and does not learn about the new signal delivered to the blocked thread
   until that thread is no longer blocked in the kernel.

>How-To-Repeat:
The problem can be reproduced using the attached waitthread and tkill
programs and gdb. The waitthread program creates the specified number of
threads and has them wait on a mutex until it receives a character on a
named pipe. The tkill program sends signals to individual threads using the
SYS_thr_kill2 syscall.

The following commands can be used to reproduce the problem (assuming
waitthread and tkill have already been built).

% gdb ./waitthread
(gdb) set args 8
(gdb) run
Starting program: /usr/home/mcnulty/scratchdev/i386-unknown-freebsd7.2/waitthread 8
[New LWP 100073]
[New Thread 28404140 (LWP 100073)]
1149
[New Thread 28404280 (LWP 100054)]
100054 waiting on lock
[New Thread 284043c0 (LWP 100075)]
100075 waiting on lock
[New Thread 28404500 (LWP 100076)]
100076 waiting on lock
[New Thread 28404640 (LWP 100082)]
100082 waiting on lock
[New Thread 28404780 (LWP 100086)]
100086 waiting on lock
[New Thread 284048c0 (LWP 100087)]
100087 waiting on lock
[New Thread 28404a00 (LWP 100088)]
100088 waiting on lock
[New Thread 28404b40 (LWP 100089)]
100089 waiting on lock

# at another prompt, send signals to different threads
% procstat -t 1149
  PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
 1149 100054 waitthread       -                  0  160 sleep   umtxn
 1149 100073 waitthread       initial thread     0  160 sleep   fifoor
 1149 100075 waitthread       -                  0  160 sleep   umtxn
 1149 100076 waitthread       -                  0  160 sleep   umtxn
 1149 100082 waitthread       -                  0  160 sleep   umtxn
 1149 100086 waitthread       -                  0  160 sleep   umtxn
 1149 100087 waitthread       -                  0  160 sleep   umtxn
 1149 100088 waitthread       -                  0  160 sleep   umtxn
 1149 100089 waitthread       -                  0  160 sleep   umtxn

# simulate the breakpoint
% ./tkill 1149 100054 5

# simulate the new signal, sent to a blocked thread
% ./tkill 1149 100086 2

# gdb should know about the trap signal
Program received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 28404280 (LWP 100054)]
0x2809f475 in __error () from /lib/libthr.so.3
(gdb) c
Continuing.

# after continuing the process, the process never stops because of the pending SIGINT

>Fix:
The attached patch (to CURRENT) takes a stab at fixing the problem. The
patch adds code so that, even if the process is being traced, a signal sent
to an interruptibly sleeping thread wakes up the thread so that it will
handle the signal when the process leaves the stopped state. The added code
was copied from the code that handles stopped, non-traced processes later
in the same function.
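
The tkill attachment is not reproduced in this message. For readers who want to follow the steps under How-To-Repeat, the block below is a minimal sketch of what such a helper might look like. It is not the submitter's program. The names parse_id, send_thread_signal, and the TKILL_STANDALONE build guard are hypothetical, and the only detail taken from the report is that the signal is sent with the SYS_thr_kill2 syscall, invoked as `tkill <pid> <tid> <signo>`. On systems other than FreeBSD the sketch simply fails with ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/syscall.h>
#endif

/*
 * Hypothetical helper: parse a non-negative decimal integer.
 * Returns 0 and stores the value in *out on success, -1 on bad input.
 */
static int
parse_id(const char *text, long *out)
{
	char *end = NULL;
	long value;

	assert(out != NULL);
	errno = 0;
	value = strtol(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0' || value < 0)
		return (-1);
	*out = value;
	return (0);
}

/*
 * Hypothetical wrapper: send signal `sig` to thread `tid` of process `pid`.
 * Uses the thr_kill2 syscall on FreeBSD; elsewhere it reports ENOSYS.
 */
static int
send_thread_signal(pid_t pid, long tid, int sig)
{
#ifdef __FreeBSD__
	return ((int)syscall(SYS_thr_kill2, pid, tid, sig));
#else
	(void)pid;
	(void)tid;
	(void)sig;
	errno = ENOSYS;
	return (-1);
#endif
}

/* Build with -DTKILL_STANDALONE to get the command-line program. */
#ifdef TKILL_STANDALONE
int
main(int argc, char **argv)
{
	long pid, tid, sig;

	if (argc != 4 || parse_id(argv[1], &pid) != 0 ||
	    parse_id(argv[2], &tid) != 0 || parse_id(argv[3], &sig) != 0) {
		fprintf(stderr, "usage: %s <pid> <tid> <signo>\n", argv[0]);
		return (EXIT_FAILURE);
	}
	if (send_thread_signal((pid_t)pid, tid, (int)sig) != 0) {
		perror("thr_kill2");
		return (EXIT_FAILURE);
	}
	return (EXIT_SUCCESS);
}
#endif
```

Assuming the file is saved as tkill.c, something like `cc -DTKILL_STANDALONE -o tkill tkill.c` should produce a binary that can be invoked as in the session above, e.g. `./tkill 1149 100054 5` for SIGTRAP and `./tkill 1149 100086 2` for SIGINT.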

>Release-Note:
>Audit-Trail:
>Unformatted: