Date:      Sun, 27 Oct 1996 07:52:40 -0800 (PST)
From:      "Rodney W. Grimes" <rgrimes@GndRsh.aac.dev.com>
To:        joerg_wunsch@uriah.heep.sax.de
Cc:        freebsd-scsi@FreeBSD.org, petzi@zit.th-darmstadt.de
Subject:   Re: SCSI harddisk trouble: MEDIUM ERROR
Message-ID:  <199610271552.HAA18645@GndRsh.aac.dev.com>
In-Reply-To: <199610270844.JAA25043@uriah.heep.sax.de> from J Wunsch at "Oct 27, 96 09:44:12 am"

> As Michael Beckmann wrote:
> 
> > Oct 27 03:26:59 zit1 /kernel: sd0(ncr0:0:0): MEDIUM ERROR info:3b13fe asc:11,b
> > Oct 27 03:26:59 zit1 /kernel: sd0(ncr0:0:0):  Unrecovered read error -
> > recommend reassignment sks:80,32
> 
> > I took the disk out of the system, and checked its configuration. AWRE and
> > ARRE were set. I thought the disk was supposed to automatically reassign
> > bad blocks in this configuration ?
> 
> Yep, it's weird that it doesn't.
> 
> > fsck gave me a decimal block number for the bad block, how can I tell which
> > block is bad here ?
> 
> The `info' field in a MEDIUM ERROR response is supposed to be the
> (hexadecimal) block number of the failure, thus it's the total block
> 0x3b13fe = 3871742.
> 
> On the same note, fsck reports the block numbers relative to the start
> of the filesystem, i think.
> 
> > Is there a way to run a disk check under FreeBSD, which
> > finds (and reassigns) bad blocks ?
> 
> People have been asking for such a tool, but it's a little of work to
> do it that nobody did yet...
> 
> > Is there a good way to reassign blocks
> > automatically, and avoid the problems that follow these medium errors ?
> 
> ARRE :)  Too bad it doesn't work for you.


ARRE CAN NOT REASSIGN A HARD READ ERROR: that would result in data loss,
and drives are not allowed to reassign a block automatically if doing so
would mean data loss.

Try 5 passes of this:
	dd if=/dev/rsd0 of=/dev/null skip=3871740 count=4 bs=512

If it still gives you the error, the data has been lost, in which case
a write operation to the block will cause AWRE to fix it:

	BACK UP ALL DATA BEFORE ATTEMPTING THE FOLLOWING!!
	dd if=/dev/zero of=/dev/rsd0 bs=512 count=1 seek=3871742

Then run a manual fsck, just in case this was a metadata block.
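
The arithmetic behind those block numbers can be sketched in sh. This is
only an illustration, not a tested recovery tool: the variable names and
the read_passes helper are made up for this example, and the script only
prints the dd commands instead of running them against the disk. It
assumes 512-byte blocks and that the `info' field is the absolute block
number in hex. Note that dd takes skip= for an offset into the input and
seek= for an offset into the output.

```shell
#!/bin/sh
# Sketch only: nothing here touches a disk unless you call read_passes
# yourself. All names below are hypothetical, chosen for illustration.

info=3b13fe                 # `info' field from the MEDIUM ERROR line (hex)
disk=/dev/rsd0              # raw whole-disk device, as in the message
block=$((0x$info))          # hex -> decimal absolute block number
start=$((block - 2))        # begin the read test two blocks early

# read_passes DEVICE START COUNT PASSES
# Reads COUNT 512-byte blocks beginning at block START, PASSES times,
# and prints how many of those passes dd reported as failed.
read_passes() {
    dev=$1; first=$2; count=$3; passes=$4
    failed=0
    pass=1
    while [ "$pass" -le "$passes" ]; do
        if ! dd if="$dev" of=/dev/null bs=512 skip="$first" \
                count="$count" 2>/dev/null; then
            failed=$((failed + 1))
        fi
        pass=$((pass + 1))
    done
    echo "$failed"
}

# Build the commands as strings so they can be reviewed before use.
read_cmd="dd if=$disk of=/dev/null bs=512 skip=$start count=4"
write_cmd="dd if=/dev/zero of=$disk bs=512 seek=$block count=1"

echo "bad block (decimal): $block"
echo "read test:           $read_cmd"
echo "overwrite (DANGER):  $write_cmd"
```

Something like `read_passes /dev/rsd0 3871740 4 5` would then do the five
read passes; if it reports 5 failures, the block is most likely gone for
good and only the write will get it reassigned.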

> If you can backup the disk, and if it's not a funny one (by some SCSI
> vendor who's cheating), you can reformat it.  I know it at least from
> an older Seagate disk i used to have, which also experienced
> occasional medium errors (despite of ARRE -- apparently, the defect
> list for this zone was full), where reformatting cured the disk until
> it has recently been pensioned since it grew too small for my needs.

People are very confused about just what AWRE and ARRE can and can not
do for you.  Since we don't do read-after-write verify operations, if
a hard data error occurs in the block during the write and goes
undetected until the read, there is nothing that AWRE/ARRE can do about
it.  Data that went bad while stored is another such case, like from
dropping the disk drive on the floor from 3 feet :-)


-- 
Rod Grimes                                      rgrimes@gndrsh.aac.dev.com
Accurate Automation, Inc.                   Reliable computers for FreeBSD


