To: freebsd-questions@freebsd.org
From: Michael Powell
Subject: Re: cause of reboot
Date: Mon, 30 Sep 2013 14:31:19 -0400

Patrick Lamaiziere wrote:

[snip]
>>
>> I looked "last"
>> command,
>> reboot ~ ~ AM 03.15 ~
>
> The last time It happened (one month ago) to me it was the hard disk
> (periodic scripts read a large part of the disk).
>
> If the disk is smart capable try a full test with smartctl
> (sysutils/smartmontools)

My gateway/firewall/mail/ids router box at home has 2 GB RAM in it, so
normally it has enough extra room that nothing ever pushes over into swap,
with one exception: the periodic run at 0300. It is generally never more
than just a few kilobytes, but I find it slightly surprising nonetheless.

If a sector (or more) on the drive that is backing the swap partition has
gone bad, it might not even be noticeable until something pages out to swap
(like my 0300 periodic run). If the drive is a WD, the 'Quick' test using
the manufacturer's wddiags utility should spot it, and is non-destructive.
I have occasionally seen the full test not destroy data, but I wouldn't
count on it being non-destructive. However, as long as the remap area isn't
full, the long test will repair the drive by relocating and mapping out the
bad spot. When this silent fading away of magnetic media occurs on a
higher-end RAID controller, the scrub function in the controller BIOS is
where you would want to go.

The other problem relative to this that I've run into is the cart before
the horse syndrome around backups. I have seen dump fail, so there was no
way to back up the data before using the full wddiags test to repair the
drive, and you kinda get stuck. If the full test is going to wipe the drive
and you can't generate a fresh current backup, you're stuck only being able
to restore whatever is the last good backup you have on hand.

Wouldn't surprise me at all if this were to turn out to be the drive having
just recently grown one or more bad spots. If an old drive has a bad spot
or a few that get repaired, I might continue to use it for a while, maybe
even for a year or so. If 2 months later it starts growing more bad spots,
the drive goes in the rubbish bin.

-Mike
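
For the smartctl check suggested in the quoted text, here is a minimal
sketch of how one might look for bad-sector counters in the attribute table
printed by "smartctl -A". It is not from the original thread. The function
name bad_sector_counts and the WATCHED list are made up for illustration,
and which attributes a given drive actually reports varies by vendor.

```python
# Hypothetical helper: scan the text printed by `smartctl -A <device>` for
# attributes that commonly indicate bad or remapped sectors on ATA drives.
# The attribute list below is an assumption; not every drive reports all of them.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def bad_sector_counts(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes in the text."""
    counts = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns: ID, name, flag, value,
        # worst, thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            # Skip rows whose raw value is not a plain integer.
            if raw.isdigit():
                counts[fields[1]] = int(raw)
    return counts
```

To use it, capture the output of "smartctl -A /dev/ada0" (the device name is
only an example) and pass the text to bad_sector_counts. Any nonzero count
is a hint to get a backup first and then think about a long self-test.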