From owner-freebsd-hackers@FreeBSD.ORG  Sat Dec 18 17:09:02 2004
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Message-ID: <41C46426.3090900@elischer.org>
Date: Sat, 18 Dec 2004 09:08:54 -0800
From: Julian Elischer <julian@elischer.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8a3) Gecko/20041017
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Peter Jeremy <PeterJeremy@optushome.com.au>
References: <41C3D62D.7000808@comcast.net>
	<20041218091739.GC97121@cirb503493.alcatel.com.au>
In-Reply-To: <20041218091739.GC97121@cirb503493.alcatel.com.au>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-hackers@freebsd.org
cc: Gary Corcoran <garycor@comcast.net>
Subject: Re: Multiple hard disk failures - coincidence ?

Peter Jeremy wrote:
> On Sat, 2004-Dec-18 02:03:09 -0500, Gary Corcoran wrote:
> 
>>I've just had *THREE* Maxtor 250GB hard disk failures on my
>>FreeBSD 4.10 server within a matter of days.  One I could
>>attribute to actual failure.  Two made me suspicious.  Three
>>has me wondering if this is some software problem...   (or
>>a conspiracy (just kidding) ;-) )
> 
> 
> Seems unlikely that faulty server software could cause a disk failure.
> One possibility is that your power supply is a bit stressed and the
> supply rails are out of tolerance.  The other possibility is that the
> drives are overheating.  Higher density drives will be more sensitive
> to both heat and dirty power.
> 
> 
>> I suppose it
>>is possible these errors may have shown up more than a week or
>>two ago, because my windows machines, reaching them via samba,
>>haven't shown any problems until today, and of course with almost
>>750GB of data, it's not all accessed over a short time span.
> 
> 
> My approach to this is to add a line similar to 
>   dd if=/dev/ad0 of=/dev/null bs=32k
> for each disk into /etc/daily.local (or /etc/weekly.local or whatever).
> This ensures that the disks are readable on a regular basis.
> 
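
A minimal sketch of that periodic read check, in case it helps. The
function name scan_disks and the device names in the usage comment are
assumptions for illustration; list whatever disks the machine really has.

```shell
#!/bin/sh
# Read every given disk (or file) end to end and report the ones that
# fail. scan_disks is a made-up helper name, not an existing tool.
scan_disks() {
    rc=0
    for disk in "$@"; do
        # dd exits non-zero when a read fails; its own transfer
        # statistics and error text are discarded to keep the daily
        # mail short (the kernel still logs the hard error itself).
        if ! dd if="$disk" of=/dev/null bs=32k 2>/dev/null; then
            echo "read scan FAILED on $disk"
            rc=1
        fi
    done
    return $rc
}

# Example call from /etc/daily.local (device names are assumptions):
#   scan_disks /dev/ad0 /dev/ad1
```

dd stops at the first unreadable block here, which is enough to flag
the disk; the block number itself shows up in the kernel messages.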
> 
>>P.S. I *can't* be the first person to run into this problem:
>>When one gets a "hard error" reported for a certain block number,
>>how does one find out exactly *which* file or directory is now
>>unreadable?  With hundreds of thousands of megabytes on one disk,
>>a manual search is not practical - somebody must have written a
>>program to 'backtrack' a block number to a particular file name
>>- no?
> 

I generally do a tar cf /dev/null /mountpoint

We have some tools that do the reverse:
they tell you what blocks are in a file.
It should be possible to modify fsck to do the inverse,
e.g. with a (hypothetical) option like:

fsck -n --findblocks 234234,56546,2342342
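
Until something like that exists, the read-everything approach can be
sketched as below. It walks the filesystem and prints the name of each
file that cannot be read in full, which answers "which file is hit"
without needing the block number. The name find_unreadable is made up
for this example. One caveat, as far as I know: GNU tar is documented
to skip reading file data when the archive is /dev/null, so reading
the files directly is the safer check.

```shell
#!/bin/sh
# Print every regular file under a directory that cannot be read in
# full. find_unreadable is a hypothetical helper name.
# Limitation: file names containing newlines are not handled.
find_unreadable() {
    # -xdev keeps find on the one filesystem being checked
    find "$1" -xdev -type f -print | while IFS= read -r f; do
        # cat forces every data block of the file to be read
        cat "$f" > /dev/null 2>&1 || printf 'UNREADABLE: %s\n' "$f"
    done
}

# Example (mount point is an assumption):
#   find_unreadable /mountpoint
```

This only finds damage in file data; a bad block inside an inode
table or directory would still need the inode search described below.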

> 
> I know I've done this in the past but I don't recall exactly how.
> About all you can do is search through the inode list for the
> relevant blocks and then map the inode numbers to file names.
>