From owner-freebsd-hackers@freebsd.org Tue Jul 5 14:43:56 2016
From: Maxim Sobolev
Date: Tue, 5 Jul 2016 07:43:52 -0700
Subject: Re: A faulty program corrupts some its data preventing correct core generation (Failed to write core file for process postgres (error 14))
To: Konstantin Belousov
Cc: stable@freebsd.org, hackers@freebsd.org
In-Reply-To: <20160705114808.GN38613@kib.kiev.ua>
References: <20160705114808.GN38613@kib.kiev.ua>

Seems like a candidate for an MFC into releng/10.3 and an appropriate errata entry?

-Max

On Tue, Jul 5, 2016 at 4:48 AM, Konstantin Belousov wrote:

> On Mon, Jul 04, 2016 at 10:26:25PM -0700, Maxim Sobolev wrote:
> > Hi all, investigating some random postgresql-9.1.21 server crashes on
> > FreeBSD 10.3, we've started seeing those after upgrading from postgres
> > 9.1.18 on more than one system, so hardware (e.g. RAM issues) are very
> > unlikely. I suspect that postgres is at fault, however I am also curious
> > how could it be that kernel is not capable of generating core file when
> > application does something silly? Is it that some ELF-related data
> > structures got corrupted or something else? Are we protecting the page
> > where ELF header is mapped with R/O flag? I am looking at possibly
> > recreating this by poking around elf header(s), seeing if I can corrupt it
> > in a similar manner reliably, any pointers or suggestions are appreciated.
> >
> > Jun 27 04:10:18 dal12 kernel: Failed to write core file for process postgres (error 14)
> > Jun 27 04:10:18 dal12 kernel: pid 41361 (postgres), uid 70: exited on signal 11
> > Jul  1 05:21:46 dal12 kernel: Failed to write core file for process postgres (error 14)
> > Jul  1 05:21:46 dal12 kernel: pid 1722 (postgres), uid 70: exited on signal 11
> >
> > #define EFAULT          14              /* Bad address */
> >
> > The resulting files are truncated and is not really usable for anything.
> > We've seen the same issue
> >
> > -rw-------  1 pgsql  wheel  1310720 Jun 27 04:10 postgres.41361.core
> > -rw-------  1 pgsql  wheel  1310720 Jul  1 05:21 postgres.1722.core
> >
> > [ssp-root@dal12 /var/tmp]$ sudo gdb711 postgres postgres.1722.core
> > GNU gdb (GDB) 7.11 [GDB v7.11 for FreeBSD]
> > Copyright (C) 2016 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-portbld-freebsd10.3".
> > Type "show configuration" for configuration details.
> > For bug reporting instructions, please see:
> > .
> > Find the GDB manual and other documentation resources online at:
> > .
> > For help, type "help".
> > Type "apropos word" to search for commands related to "word"...
> > Reading symbols from postgres...(no debugging symbols found)...done.
> > BFD: Warning: /var/tmp/postgres.1722.core is truncated: expected core file size >= 517120000, found: 1310720.
> > [New LWP 100261]
> > Core was generated by `postgres'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0x0000000800cfba67 in ?? () from /lib/libthr.so.3
> > (gdb) where
> > #0  0x0000000800cfba67 in ?? () from /lib/libthr.so.3
> > Backtrace stopped: Cannot access memory at address 0x7fffffffdd08
> > (gdb) q
> >
> https://lists.freebsd.org/pipermail/freebsd-stable/2016-June/084877.html
>
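
For anyone who wants to spot such truncated cores without starting gdb, here is a minimal sketch in Python. It assumes the "expected core file size" in the BFD warning above is the largest p_offset + p_filesz over the ELF program headers; that is an assumption about how the check works, not something confirmed in this thread. It handles only little-endian ELF64 (amd64), ignores the PN_XNUM extension for very large header tables, and assumes the program header table sits within the first 1 MiB of the file. The function names are made up for illustration.

```python
import os
import struct


def expected_core_size(data: bytes) -> int:
    """Smallest file size that holds every segment listed in the
    program header table of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch handles only ELF64 little-endian")
    # Elf64_Ehdr: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    end = 0
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_offset at byte 8, p_filesz at byte 32.
        (p_offset,) = struct.unpack_from("<Q", data, base + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        end = max(end, p_offset + p_filesz)
    return end


def core_is_truncated(path: str) -> bool:
    """True if the file is shorter than its program headers claim.
    Assumes (hypothetically) that the header table fits in 1 MiB."""
    with open(path, "rb") as f:
        head = f.read(1 << 20)
    return os.path.getsize(path) < expected_core_size(head)
```

Calling core_is_truncated("/var/tmp/postgres.1722.core") on the file from the report would be the intended use. The headers are written at the start of the core, so they should still be readable in a file that was cut short later.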