From owner-freebsd-hackers@FreeBSD.ORG  Fri Apr 15 15:18:20 2005
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from green.homeunix.org (freefall.freebsd.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP
	id 77CFD16A4CE; Fri, 15 Apr 2005 15:18:20 +0000 (GMT)
Received: from green.homeunix.org (green@localhost [127.0.0.1])
	by green.homeunix.org (8.13.3/8.13.1) with ESMTP id j3FFJ4l3003148;
	Fri, 15 Apr 2005 11:19:04 -0400 (EDT)
	(envelope-from green@green.homeunix.org)
Received: (from green@localhost)
	by green.homeunix.org (8.13.3/8.13.1/Submit) id j3FFJ4x9003147;
	Fri, 15 Apr 2005 11:19:04 -0400 (EDT)
	(envelope-from green)
Date: Fri, 15 Apr 2005 11:19:04 -0400
From: Brian Fundakowski Feldman <green@freebsd.org>
To: Bill Vermillion <bv@wjv.com>
Message-ID: <20050415151904.GR981@green.homeunix.org>
References: <20050415120104.AD04C16A4CF@hub.freebsd.org>
	<20050415141052.GB96815@wjv.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050415141052.GB96815@wjv.com>
User-Agent: Mutt/1.5.6i
cc: freebsd-hackers@freebsd.org
Subject: Re: imminent disk failure ?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 15 Apr 2005 15:18:20 -0000

On Fri, Apr 15, 2005 at 10:10:52AM -0400, Bill Vermillion wrote:
> On or about Fri, Apr 15, 2005 at 12:01 , while attempting a 
> Zarathustra emulation freebsd-hackers-request@freebsd.org thus spake:
> 
> 
> 
> > Message: 4
> > Date: Thu, 14 Apr 2005 10:58:02 -0500 (CDT)
> > From: "H. S." <security@revolutionsp.com>
> > Subject: imminent disk failure ?
> 
> ...
> 
> > I have a server running 4.X for almost two years now, without
> > problems - rock solid as it should be - yesterday the server
> > became unresponsive, now that I have access again, and while
> > checking the logs, I found this as the last message before the
> > unresponsiveness:
> 
> > /kernel: ad0: READ command timeout tag=0 serv=0 - resetting
> 
> > The next message is the system coming back up, 1 hour later.
> 
> > I have not changed anything kernel-related on this system for
> > a long time (Jul 2004); I just apply the occasional kernel patch
> > and rebuild/reboot the system. I never encountered this problem
> > before. Could this message mean this disk is drawing its last
> > breaths ?
> 
> It might help if we knew a bit more about the system, such
> as the drive make and model - you can see that in dmesg.  That may
> point out some device that is known to be problematic.
> 
> The last time I got timeout errors like that was in the 3.x era
> with a SCSI controller.  The last IDE problem I had was a bad read
> that forced the system into PIO mode, with a performance decrease
> of over 75%.  The only way around that one that I was aware of was
> a reboot.

For any disk made within perhaps the last five years, you should be
able to use SMART to run a thorough health test on your hard drives
and to view their statistics and error logs.  ports/sysutils/smartmontools
works great for ATA, though I don't know why it currently doesn't do
much for SCSI.
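
As a rough illustration of that kind of SMART check, here is a minimal
sketch that drives the smartctl utility from smartmontools.  The device
path /dev/ad0 is only assumed from the "ad0" in the log line above, and
the helper names smartctl_commands and run_checks are hypothetical, made
up for this example; adjust both to the actual system.

```python
import shutil
import subprocess


def smartctl_commands(device="/dev/ad0"):
    """Return the smartctl invocations for a basic check of `device`.

    The default device is an assumption taken from the "ad0" log line.
    """
    return [
        ["smartctl", "-H", device],              # overall health self-assessment
        ["smartctl", "-A", device],              # vendor attributes (statistics)
        ["smartctl", "-l", "error", device],     # the drive's own error log
        ["smartctl", "-t", "long", device],      # start an extended self-test
        ["smartctl", "-l", "selftest", device],  # self-test log (read after the test ends)
    ]


def run_checks(device="/dev/ad0"):
    """Run each command in turn and print whatever smartctl reports."""
    if shutil.which("smartctl") is None:
        raise RuntimeError("smartctl not found; install sysutils/smartmontools")
    for cmd in smartctl_commands(device):
        print("$", " ".join(cmd))
        # smartctl's exit status is a bitmask, so a nonzero value is not
        # treated as a failure to run here; just show the output.
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout)
```

Calling run_checks() as root would print each report in turn.  The long
self-test runs inside the drive and can take a while, so the self-test
log is only meaningful once the test has finished.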

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\