From: Robert Fitzpatrick
Date: Mon, 22 Aug 2016 12:10:16 -0400
To: David Christensen
CC: freebsd-questions@freebsd.org
Subject: Re: Monitoring server for crashes
Message-ID: <57BB23E8.10906@webtent.org>

David Christensen wrote:
> I would advise testing before changing anything. The strategy is:
> devise a reproducible test that invokes the bug, use the test to isolate
> the bug, fix the bug, re-run test to verify the bug is fixed, re-run the
> test periodically to verify that that bug has not returned.

Sorry it took so long to post back. Yes, I decided this approach was
best, and found that a full backup of the PostgreSQL database causes the
crash, while backing up individual databases (including the largest)
does not. Perhaps Postgres just provides the environment that triggers
the crash and is not the culprit, but it's the only way I can reproduce
it at the moment.

I did find the IBM specs for the memory on that board but have not yet
compared them to what is in the box; maybe soon. I'm just slammed with
work, and this box is replicated. While I don't want it down, I can take
my time finding the real solution as long as a workaround exists. If the
memory is not right, I will change it and re-run the full backup of
Postgres.

This box is currently running FreeBSD 10.0-RELEASE-p18.

--
Robert