From: Ian Smith
To: Raimo Niskanen
cc: freebsd-questions@freebsd.org
Date: Fri, 2 Jun 2017 00:34:43 +1000 (EST)
Subject: Re: Advice on kernel panics
Message-ID: <20170601235447.C98304@sola.nimnet.asn.au>

In freebsd-questions Digest, Vol 678, Issue 4, Message: 4
On Thu, 1 Jun 2017 10:27:49 +0200 Raimo Niskanen wrote:
 > On Thu, Jun 01, 2017 at 12:10:30AM -0500, Doug McIntyre wrote:
 > > On Mon, May 29, 2017 at 11:20:43AM +0200, Raimo Niskanen wrote:
 > > > I have a server that panics about every 3 days and need some advice on how
 > > > to handle that.
 > >
 > > I'd expect it is some sort of hardware failure, as I would expect
 > > kernel panics more on the order of once a decade with FreeBSD. Ie.
 > > I've seen one or two on my hundred or so servers, but its pretty rare.
 > >
 > > Check and recheck your hardware items.
 >
 > I have removed one of four memory capsules - panicked again. Will rotate
 > through all of them...
 >
 > >
 > > Runup memtest86+. Check your drive hardware, turn on SMART checking.
 >
 > I have run memtest86+ over night - no errors found.
 >
 > I have installed smartmontools - no errors found, short and long self tests
 > on both disks run fine. zpool scrub repaired 0 errors and has no known data
 > errors.

Everyone's suggesting hardware problems, and it's certainly worthwhile
eliminating that possibility - but this could be a software/OS issue.

If it were me and hardware all checks out, I'd try posting the original
report - plus other details about the box and setup that you've since
mentioned - to freebsd-stable@, or maybe freebsd-fs@ since those fstat
reports seem to point to possible FS/zfs issues? at a wild guess ..

One other hardware tester you might try is sysutils/stress which can
pound CPU, I/O, VM, disk as hard and for as long as you like, without
having to bring the box down. I've used this lots to generate heavy
loads. Keep a close eye on system temperatures during longer tests.

Ah, just before posting, I see your latest with dmesg. Just on a quick
scan, I wonder if these are a bad indication? Maybe just a side-issue,
but powerd might not work, so again heat might be something to watch:

  est0: on cpu0
  est: CPU supports Enhanced Speedstep, but is not recognized.

cheers, Ian
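
For watching temperatures while a load test such as sysutils/stress runs, something along these lines might do. It is only a sketch and not part of the original advice. It assumes coretemp(4) or amdtemp(4) is loaded, so that `sysctl dev.cpu` prints lines like "dev.cpu.0.temperature: 45.0C". The function names are made up for illustration.

```python
import re
import subprocess

# Assumption: coretemp(4) or amdtemp(4) is loaded, so sysctl exposes
# lines of the form "dev.cpu.0.temperature: 45.0C".
TEMP_LINE = re.compile(
    r"^dev\.cpu\.(\d+)\.temperature:\s*(-?\d+(?:\.\d+)?)C\s*$"
)


def parse_cpu_temps(sysctl_output):
    """Return {cpu_index: degrees_C} parsed from `sysctl dev.cpu` output."""
    temps = {}
    for line in sysctl_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


def hottest(temps):
    """Return (cpu_index, degrees_C) for the hottest core, or None if empty."""
    if not temps:
        return None
    return max(temps.items(), key=lambda item: item[1])


def read_cpu_temps():
    """Query the live system; only meaningful on a FreeBSD box."""
    result = subprocess.run(
        ["sysctl", "dev.cpu"], capture_output=True, text=True, check=True
    )
    return parse_cpu_temps(result.stdout)
```

Calling read_cpu_temps() every few seconds from a second terminal, while e.g. `stress --cpu 4 --timeout 600s` runs in the first (worker count and duration picked arbitrarily here), should show whether any core keeps climbing under load.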