From: Willem Jan Withagen
To: Konstantin Belousov
Cc: cem@freebsd.org, "freebsd-hackers@freebsd.org"
Subject: Re: setting distinct core file names
Date: Wed, 28 Nov 2018 16:27:26 +0100
In-Reply-To: <20181128144328.GF2378@kib.kiev.ua>
References: <84f498ff-3d65-cd4e-1ff5-74c2e8f41f2e@digiware.nl> <7b2b134c-3fd3-6212-b06a-81003361e083@digiware.nl> <20181128144328.GF2378@kib.kiev.ua>

On 28/11/2018 15:43, Konstantin Belousov wrote:
> On Wed, Nov 28, 2018 at 12:21:33PM +0100, Willem Jan Withagen wrote:
>> On 28-11-2018 11:43, Willem Jan Withagen wrote:
>>> On 27-11-2018 21:46, Conrad Meyer wrote:
>>>> One (ugly) trick is to use multiple filesystem links to the script
>>>> interpreter, where the link names distinguish the scripts. E.g.,
>>>>
>>>> $ ln /bin/sh /libexec/my_script_one_sh
>>>> $ ln /bin/sh /libexec/my_script_two_sh
>>>> $ cat myscript1.sh
>>>> #!/libexec/my_script_one_sh
>>>> ...
>>>>
>>>> Cores will be dumped with %N of "my_script_one_sh."
>>> Neat trick... got to try and remember this.
>>> But it is not the shell scripts that are crashing...
>>>
>>> When running Ceph tests during Jenkins building some
>>> programs/executables intentionally crash leaving cores.
>>> Others (scripts) use some of these programs with correct input and
>>> should NOT crash. And test during startup and termination that there are
>>> no cores left.
>>>
>>> One Jenkins test run takes about 4 hours when not executed in parallel.
>>> I'm testing 4 versions multiple times a day to not have this huge list of
>>> PRs to go through when testing fails.
>>>
>>> But the intentional cores and the failure cores here collide.
>>> And when I have a core program_x.core I can't tell if it is from a
>>> failure or from an intentional crash.
>>>
>>> Now if I could tell per program how to name its core, that would allow me
>>> to fix the problem without overturning the complete Ceph testing
>>> infrastructure, and still keep parallel tests.
>>>
>>> It would also help in that "regular" cores just keep going the way they
>>> are. So other applications still have the same behaviour, and are still
>>> picked up by periodic processing.
>> So I read a bit more about procctl and prctl (the Linux variant) and it
>> turns out that Linux can set PR_SET_DUMPABLE. And that is actually used
>> in some of the Ceph applications...
>>
>> Being able to set this to 0 or 1 would perhaps be a nice start as well.
> Isn't setrlimit(RLIMIT_CORE, 0) enough ? It is slightly different syntax,
> but the idea is that you set RLIMIT_CORE to zero, then we do not even
> start dumping.

Right,

At one point I think I had this code in some test code....
I also think this is the default on CentOS when I tested it there.

So I set it from the top-shell to propagate.
But then I could have run into:
     [EPERM]            The limit specified to setrlimit() would have raised
                        the maximum limit value, and the caller is not the
                        super-user.
when I did want dumps.

I'm not sure, it was quite some time ago.
But that might be a nice suggestion to look into.

--WjW
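
For reference, a minimal sketch of the setrlimit(RLIMIT_CORE, 0) idea that should avoid the EPERM case quoted above. It assumes the cause of the EPERM is that the hard (maximum) limit was lowered together with the soft limit; an unprivileged process may move its soft limit anywhere up to the hard limit, but may not raise the hard limit itself. The helper name core_soft_limit_set() is hypothetical and is not taken from Ceph or FreeBSD.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: change only the soft RLIMIT_CORE value.
 * The hard limit (rlim_max) is read back and written unchanged, so a
 * later call can raise the soft limit again without super-user rights.
 * Returns 0 on success, -1 on failure with errno set by the system call.
 */
static int
core_soft_limit_set(rlim_t soft)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_CORE, &rl) != 0)
		return (-1);

	/* Never ask for more than the hard limit; that would fail. */
	if (soft > rl.rlim_max)
		soft = rl.rlim_max;

	rl.rlim_cur = soft;
	return (setrlimit(RLIMIT_CORE, &rl));
}
```

A test program that crashes on purpose could call core_soft_limit_set(0) early in main(), while programs that are not supposed to crash keep the inherited limit, so any core they leave behind is a real failure. Since the limit is inherited across fork() and exec, a wrapper can also set it just before starting the intentionally crashing program.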