From: Norbert Koch
Subject: creating coredump of multithreaded process
Date: Fri, 27 Oct 2017 10:44:41 +0200
List-Id: Technical Discussions relating to FreeBSD

Hello.

When trying to create a coredump of a running process (without killing it) under FreeBSD 10.3, I am seeing somewhat strange behaviour.

As I want to see the state of all threads, the quick-and-dirty way of fork() + SIGABRT does not work for me, since fork() duplicates only the calling thread.

So what I do is have a supervisor program wait for SIGUSR1. When my application wants to be coredumped, it sends SIGUSR1 and immediately afterwards sends SIGSTOP to itself. The supervisor then forks gcore.

From what I can see using top, my application immediately starts running again, as if SIGCONT had been received, while gcore hangs in wait. gcore calls ptrace(PT_ATTACH) followed by waitpid(). So I assume that the ptrace call restarts my application and waitpid() hangs (why?).

If I manually send SIGCONT to my stopped application immediately before exec-ing gcore, the coredump is created, but for obvious reasons it is not as consistent as I want it to be.

I should add that most other signals are blocked in my application. Blocking (or not blocking) SIGCONT seems to have no effect.

Am I doing something wrong here? If so, is there a different/better/more elegant way of creating a consistent coredump?

Thank you.

-- 
Dipl.-Ing. Norbert Koch
Entwicklung Prozessregler

demig Prozessautomatisierung GmbH
Anschrift: Haardtstrasse 40, D-57076 Siegen
Registergericht: Siegen HRB 2819
Geschaeftsfuehrer: Joachim Herbst, Winfried Held
Telefon: +49 271 772020
Telefax: +49 271 74704
E-Mail: info@demig.de
http://www.demig.de