From owner-freebsd-hackers  Wed Oct  9 00:18:56 1996
From: J Wunsch <j@uriah.heep.sax.de>
Message-Id: <199610090709.JAA24802@uriah.heep.sax.de>
Subject: Re: MEDIUM ERROR and HARDWARE FAILURE messages
To: freebsd-hackers@freebsd.org (FreeBSD hackers)
Date: Wed, 9 Oct 1996 09:09:52 +0200 (MET DST)
Cc: pius@iago.ienet.com (Pius Fischer)
Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch)
In-Reply-To: <199610090240.TAA17268@iago.ienet.com> from Pius Fischer at "Oct 8, 96 07:40:03 pm"

As Pius Fischer wrote:

> sd0(ahc0:3:0): HARDWARE FAILURE info:24e851 asc:15,1 Mechanical positioning error
> , retries:4
> sd0(ahc0:3:0): MEDIUM ERROR info:24e851 asc:11,0 Unrecovered read error
> , retries:3

Uh-oh.

> There are many more MEDIUM ERROR messages than HARDWARE FAILURE messages.
> However, I haven't experienced any loss of data yet or noticed anything
> else bad.

Not yet, but it is likely to come.  As long as repositioning the heads
in the drive finds the correct track again, the retries suffice.  But
once the drive gets worse and the retries stop succeeding, the data in
the affected blocks is lost.  If I were you, I would sleep better at
night by doing _very_ frequent backups...

> to do. It was suggested to use scsi(8) to "enable the automatic remapping
> of bad sectors" with something like "scsi -f /dev/rsd0a.ctl -m 1 -e -P 3".
> Is this still applicable?

It is applicable for bad sectors (magnetic surface defects that are
basically quite normal for all magnetic storage media), but it won't
help in case of true hardware errors like you're experiencing.

> Is the drive going bad? Anything one can do to fix it?

If it's still under warranty, have it replaced.  The above error
messages are basically the plain SCSI error messages, so anybody at
their tech support who really knows about the SCSI specs (and
terminology) should know what they mean.  The `info' field is likely
to be the block number in question (in hex).
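
To make the `info' reading concrete, here is a minimal sketch that
picks apart a log line of the form quoted above and converts the hex
`info' value to a decimal block number.  The regular expression, the
function name parse_sense_line, and the field names are all
illustrative, not part of any FreeBSD tool.  It assumes the asc values
are printed in hex like `info', and it assumes 512-byte sectors for the
byte offset, which the original messages do not state.

```python
import re

# Pattern for the message format quoted in this thread (an assumption:
# other kernel versions may print the fields differently).
SENSE_RE = re.compile(
    r"^(?P<dev>\w+)\((?P<bus>[^)]*)\): "
    r"(?P<key>[A-Z ]+?) info:(?P<info>[0-9a-fA-F]+) "
    r"asc:(?P<asc>[0-9a-fA-F]+),(?P<ascq>[0-9a-fA-F]+) (?P<text>.*)$"
)


def parse_sense_line(line, sector_size=512):
    """Return the fields of one sense message, or None if it does not match.

    sector_size is a hypothetical default; check the real drive geometry.
    """
    m = SENSE_RE.match(line.strip())
    if m is None:
        return None
    block = int(m.group("info"), 16)  # `info' is taken to be a hex block number
    return {
        "device": m.group("dev"),
        "sense_key": m.group("key"),
        "block": block,
        "byte_offset": block * sector_size,
        "asc": int(m.group("asc"), 16),
        "ascq": int(m.group("ascq"), 16),
        "text": m.group("text"),
    }


if __name__ == "__main__":
    sample = ("sd0(ahc0:3:0): HARDWARE FAILURE info:24e851 "
              "asc:15,1 Mechanical positioning error")
    print(parse_sense_line(sample))
```

If both messages report the same block number, as the two quoted lines
do, that suggests the drive is struggling with one particular spot
rather than failing at random locations.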

As long as the drive always finds the correct track and block after a
couple of retries, nothing is totally lost.  However, I wouldn't trust
this drive anymore.  (I too used to live with a drive that had
occasional errors, but I quickly got another, smaller drive to put my
/home and other valuable data on.)

-- 
cheers, J"org

joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)