Date: Fri, 27 Oct 2017 12:33:11 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Norbert Koch <nkoch@demig.de>
Cc: freebsd-hackers@freebsd.org
Subject: Re: crerating coredump of multithreaded process
Message-ID: <20171027093311.GF2566@kib.kiev.ua>

On Fri, Oct 27, 2017 at 10:44:41AM +0200, Norbert Koch wrote:
> Hello.
> 
> When trying to create the coredump of a running
> process (without killing it) under FreeBSD 10.3
> I am seeing a somewhat strange behaviour.
Try this on HEAD or stable/11.  There were a lot of changes and bugfixes
in ptrace(2).

I do not claim that the behaviour you see has changed, but 10.3 has
diverged too far from the code that developers would be willing to look
at.
> 
> As I want to see the state of all threads, the q&d way
> of fork() + SIGABRT does not work for me.
> 
> So, what I do is having a supervisor program waiting for SIGUSR1.
> When my application signals the wish to be coredumped
> it sends SIGSTOP to itself immediately after sending SIGUSR1.
> The supervisor then forks gcore.
> 
> From what I can see using top, my application immediately starts
> again as if SIGCONT has been received while gcore hangs in wait.
SIGCONT always resumes a stopped process, even when it is blocked;
otherwise programs could create processes that can never be resumed.

> 
> Gcore calls ptrace(PT_ATTACH) followed by waitpid().
> So I assume that the ptrace call restarts my application
> and waitpid hangs (why?).
> 
> If I manually send SIGCONT to my stopped application
> immediately before exec-ing gcore, the coredump is being
> created, but for obvious reasons  not as consistent as
> I want it to be.
> 
> I should add that in my application most other signals are
> blocked. Blocking (or not) SIGCONT seems to have no effect.
> 
> Am I doing something wrong here? If yes, is there
> a different/better/more elegant way of creating a consistent coredump?

What is the purpose of sending SIGSTOP to itself? Practically, it is no
different from the action of ptrace(PT_ATTACH): all threads are parked
at some safe place in the kernel, or are forcibly moved into kernel
mode by an IPI if they are executing in userspace on other cores. To
get to the safe place in the kernel, threads often need to execute some
more. IPI delivery is also not guaranteed to occur at a deterministic
place ("at the next instruction boundary"); it happens as the hardware
reacts to it. As you see, the process is very asynchronous. It cannot
guarantee that the final snapshot is consistent with an arbitrary
thread state at the point of the request, but it does represent a valid
process state, assuming that the threads are executing asynchronously.

Moreover, ptrace(PT_ATTACH) currently does not merely use a mechanism
similar to SIGSTOP, it really sends SIGSTOP to the debuggee. We do not
track nested SIGSTOPs; a process is either stopped or runnable. So I am
not surprised that attaching to a stopped process does not complete
until the stopped state established earlier goes away: the debugger
waits for confirmation from all threads that they are parked at a safe
place, but none arrives because the threads are already stopped. Once
the threads are made runnable, the acks are sent and the attach
completes.

I am explaining this to point out that sending SIGSTOP and then
attaching with ptrace(PT_ATTACH) is just worse than doing
ptrace(PT_ATTACH) alone.  I think you need to have the supervisor either
directly execute gcore(1) without SIGSTOP, or execute ptrace(PT_ATTACH)
instead of kill(SIGSTOP) and have the gcore functionality embedded into
it.  The consistency of the generated core is actually the same.
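
The first option, a supervisor that runs gcore(1) directly against the
still-running target, might look like the sketch below. The function
names build_gcore_argv() and dump_core() are hypothetical, and the
sketch assumes the FreeBSD gcore(1) syntax "gcore -c core pid" with
gcore found via PATH.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fill argv for "gcore -c <corefile> <pid>".  pidbuf receives the
 * decimal pid and must outlive argv.  Returns 0, or -1 if pidbuf is
 * too small.
 */
static int
build_gcore_argv(pid_t pid, const char *corefile, char *pidbuf,
    size_t pidbuflen, const char *argv[5])
{
	int n;

	n = snprintf(pidbuf, pidbuflen, "%ld", (long)pid);
	if (n < 0 || (size_t)n >= pidbuflen)
		return (-1);
	argv[0] = "gcore";
	argv[1] = "-c";
	argv[2] = corefile;
	argv[3] = pidbuf;
	argv[4] = NULL;
	return (0);
}

/*
 * Run gcore(1) against a target that has NOT been stopped with
 * SIGSTOP; gcore does its own PT_ATTACH.  Returns the exit status of
 * gcore, or -1 if it could not be run or died abnormally.
 */
static int
dump_core(pid_t target, const char *corefile)
{
	const char *argv[5];
	char pidbuf[32];
	pid_t child;
	int status;

	if (build_gcore_argv(target, corefile, pidbuf, sizeof(pidbuf),
	    argv) == -1)
		return (-1);
	child = fork();
	if (child == -1)
		return (-1);
	if (child == 0) {
		execvp(argv[0], (char *const *)argv);
		_exit(127);	/* exec failed */
	}
	if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```

In the supervisor's SIGUSR1 path this would be called as, for example,
dump_core(app_pid, "/var/tmp/app.core"), where both the variable and
the path are placeholders.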