Date: Fri, 27 Oct 2017 12:33:11 +0300
From: Konstantin Belousov
To: Norbert Koch
Cc: freebsd-hackers@freebsd.org
Subject: Re: crerating coredump of multithreaded process
Message-ID: <20171027093311.GF2566@kib.kiev.ua>

On Fri, Oct 27, 2017 at 10:44:41AM +0200, Norbert Koch wrote:
> Hello.
>
> When trying to create the coredump of a running
> process (without killing it) under FreeBSD 10.3
> I am seeing a somewhat strange behaviour.
Try this on HEAD or stable/11. There were a lot of changes and bugfixes
in ptrace(2). I do not claim that the behaviour you see has changed,
but 10.3 has diverged too far from the code that developers would be
willing to look at.

>
> As I want to see the state of all threads, the q&d way
> of fork() + SIGABRT does not work for me.
>
> So, what I do is having a supervisor program waiting for SIGUSR1.
> When my application signals the wish to be coredumped
> it sends SIGSTOP to itself immediately after sending SIGUSR1.
> The supervisor then forks gcore.
>
> From what I can see using top, my application immediately starts
> again as if SIGCONT has been received while gcore hangs in wait.
SIGCONT cannot be blocked, otherwise programs could create unkillable
processes.

>
> Gcore calls ptrace(PT_ATTACH) followed by waitpid().
> So I assume that the ptrace call restarts my application
> and waitpid hangs (why?).
>
> If I manually send SIGCONT to my stopped application
> immediately before exec-ing gcore, the coredump is being
> created, but for obvious reasons not as consistent as
> I want it to be.
>
> I should add that in my application most other signals are
> blocked. Blocking (or not) SIGCONT seems to have no effect.
>
> Am I doing something wrong here? If yes, is there
> a different/better/more elegant way of creating a consistent coredump?
What is the purpose of sending SIGSTOP to itself? Practically, it is
no different from the action of ptrace(PT_ATTACH): all threads are
parked at some safe place in the kernel, or are forcibly moved into
kernel mode by an IPI if they are executing in userspace on other cores.
To get to a safe place in the kernel, threads often need to execute
some more. IPI delivery is also not guaranteed to occur at a
deterministic place ("at the next instruction boundary"); it happens
whenever the hardware reacts to it. As you see, the process is very
asynchronous. It cannot guarantee that the final snapshot is consistent
with an arbitrary thread state at the point of the request, but it does
represent a valid process state, assuming that the threads are
executing asynchronously.

Moreover, ptrace(PT_ATTACH) currently does not merely use a mechanism
similar to SIGSTOP, it really sends SIGSTOP to the debuggee. We do not
track nested SIGSTOPs; a process is either stopped or runnable. So I am
not surprised that attaching to a stopped process does not complete
until the stopped state established earlier passes away: the debugger
waits for confirmation from all threads that they are parked at a safe
place, but none arrives because the threads are already stopped. If the
threads are made runnable, the acks are sent and the attach completes.

I am explaining this to point out that sending SIGSTOP and then
attaching with ptrace(PT_ATTACH) is just worse than doing
ptrace(PT_ATTACH) alone.

I think you need to have the supervisor either directly execute
gcore(1) without SIGSTOP, or execute ptrace(PT_ATTACH) instead of
kill(SIGSTOP) and have the gcore functionality embedded into it. The
consistency of the generated core is actually the same.
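
The remark that nested SIGSTOPs are not tracked can be checked from
userland. What follows is only a minimal sketch, not part of gcore(1) or
of the kernel path described above: it uses plain signals rather than
ptrace(2), and it is written in Python purely for brevity (os.kill and
os.waitpid wrap kill(2) and waitpid(2)). The function name
stops_do_not_nest and its extra_stops parameter are made up for this
illustration.

```python
import os
import signal
import time


def stops_do_not_nest(extra_stops: int = 2) -> bool:
    """Return True if a single SIGCONT resumes a child that was sent
    one SIGSTOP plus `extra_stops` further SIGSTOPs (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child: idle until the parent kills it.
        try:
            while True:
                time.sleep(1)
        finally:
            os._exit(0)

    try:
        # First stop: wait until the kernel reports the child as stopped.
        os.kill(pid, signal.SIGSTOP)
        _, status = os.waitpid(pid, os.WUNTRACED)
        if not os.WIFSTOPPED(status):
            return False

        # Further stops are sent to a process that is already stopped.
        for _ in range(extra_stops):
            os.kill(pid, signal.SIGSTOP)

        # One SIGCONT; if stops were counted, the child would stay stopped
        # or stop again right after resuming.
        os.kill(pid, signal.SIGCONT)
        _, status = os.waitpid(pid, os.WCONTINUED)
        if not os.WIFCONTINUED(status):
            return False

        # Give the child a moment, then check that no new stop was reported.
        time.sleep(0.2)
        reaped, _ = os.waitpid(pid, os.WUNTRACED | os.WNOHANG)
        return reaped == 0
    finally:
        # Always clean up the child.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
```

On a POSIX system, calling stops_do_not_nest() is expected to return
True: the stop state is a flag, not a counter, which is why an earlier
self-sent SIGSTOP gives the attach nothing extra to build on.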