Date:      Sat, 4 Oct 1997 09:18:17 -0700 (PDT)
From:      "Justin T. Gibbs" <gibbs@FreeBSD.ORG>
To:        uhclem.bsd@FreeBSD.ORG, gibbs@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG, gibbs@FreeBSD.ORG
Subject:   Re: kern/4686
Message-ID:  <199710041618.JAA05568@freefall.freebsd.org>

Synopsis: SCSI driver gradually remaps entire drive/false read errors? - FDIV073

State-Changed-From-To: open-closed
State-Changed-By: gibbs
State-Changed-When: Sat Oct 4 08:38:51 PDT 1997
State-Changed-Why: 
Neither the aic7xxx driver nor the generic SCSI layer ever
manually remaps sectors on a drive.  I'd be interested in knowing
what code in the aic7xxx driver you modified to remove block
reassignment, as that feature doesn't exist.

As to the aic7xxx/SCSI layer incorrectly reporting media errors,
this just isn't possible.  The aic7xxx driver simply returns the
sense information that the drive provides, and in this case, it
is telling us that it believes its media is bad.

So, if the aic7xxx driver isn't remapping your sectors, who is?
The drive will remap sectors automatically if the AWRE (Automatic
Write Reallocation Enabled) and/or ARRE (Automatic Read Reallocation
Enabled) bits are set in mode page 0x01, the Read-Write Error Recovery
page, on the drive.  I think that Quantum drives ship with these on
by default.
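
For anyone who wants to check those bits by hand, here is a minimal
sketch of decoding them from a raw mode page.  It assumes the SCSI-2
layout as I understand it: AWRE is bit 7 and ARRE is bit 6 of byte 2
of the Read-Write Error Recovery page (page code 01h).  The function
name is hypothetical, and fetching the page from the drive with
MODE SENSE is left out.

```python
def decode_reallocation_flags(page: bytes) -> dict:
    """Decode AWRE/ARRE from a raw Read-Write Error Recovery mode page.

    Assumed layout (SCSI-2): byte 0 = PS bit + page code,
    byte 1 = page length, byte 2 = AWRE (bit 7), ARRE (bit 6), ...
    """
    if len(page) < 3:
        raise ValueError("mode page too short")
    page_code = page[0] & 0x3F  # low six bits; bit 7 is the PS flag
    if page_code != 0x01:
        raise ValueError("not the Read-Write Error Recovery page")
    flags = page[2]
    return {"AWRE": bool(flags & 0x80), "ARRE": bool(flags & 0x40)}


if __name__ == "__main__":
    # Example page bytes made up for illustration, not read from a drive.
    sample = bytes([0x81, 0x0A, 0xC0, 0x08, 0, 0, 0, 0, 0, 0, 0, 0])
    print(decode_reallocation_flags(sample))
```

If both flags come back true, the drive itself is free to reassign
blocks without the host ever asking for it.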

Why don't these "bad blocks" show up during a "media verify" operation?
During a verify, media is accessed via logical block number.  When a sector
is remapped, you will get the new, remapped, block during the verify, not
the original block.  If you want to see what blocks were remapped, look
at the "Grown Defect List".  The SCSI-2 spec tells you how to do this
if you don't have a utility that will do it for you.
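
As a rough illustration of what such a utility does, the sketch below
builds a READ DEFECT DATA (10) command block asking for the grown list
and parses a reply in physical sector format.  The field layout is my
reading of SCSI-2 (opcode 37h, GLIST in bit 3 of byte 2, 8-byte
descriptors of cylinder/head/sector); the function names are
hypothetical, and actually sending the command through a pass-through
interface is not shown.

```python
import struct

READ_DEFECT_DATA_10 = 0x37
PLIST = 0x10                   # request the primary (factory) list
GLIST = 0x08                   # request the grown list
FORMAT_PHYSICAL_SECTOR = 0x05  # defect list format 101b


def build_read_defect_cdb(alloc_len: int, grown: bool = True,
                          primary: bool = False) -> bytes:
    """Build a 10-byte READ DEFECT DATA CDB (assumed SCSI-2 layout)."""
    flags = FORMAT_PHYSICAL_SECTOR
    if grown:
        flags |= GLIST
    if primary:
        flags |= PLIST
    # opcode, reserved, flags, 4 reserved bytes, allocation length, control
    return struct.pack(">BBB4xHB", READ_DEFECT_DATA_10, 0, flags, alloc_len, 0)


def parse_defect_list(data: bytes) -> list:
    """Return (cylinder, head, sector) tuples from a defect list reply."""
    if len(data) < 4:
        raise ValueError("reply shorter than the 4-byte header")
    _reserved, flags, length = struct.unpack_from(">BBH", data, 0)
    if flags & 0x07 != FORMAT_PHYSICAL_SECTOR:
        raise ValueError("reply is not in physical sector format")
    body = data[4:4 + length]
    defects = []
    for off in range(0, len(body) - 7, 8):  # whole 8-byte descriptors only
        cylinder = int.from_bytes(body[off:off + 3], "big")
        head = body[off + 3]
        sector = int.from_bytes(body[off + 4:off + 8], "big")
        defects.append((cylinder, head, sector))
    return defects


if __name__ == "__main__":
    print(build_read_defect_cdb(252).hex())
    # Reply bytes made up for illustration: one defect descriptor.
    reply = bytes([0, 0x0D, 0, 8, 0x00, 0x01, 0x02, 3, 0, 0, 0, 4])
    print(parse_defect_list(reply))
```

A grown list that keeps getting longer between runs would confirm that
the drive, not the driver, is doing the remapping.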

As to your point about bus resets being dangerous if multiple targets are
active, this really isn't the case.  Bus resets are used to recover devices
that don't seem to be responding and the driver/generic SCSI layer can deal
with the consequences of a bus reset if the code believes it is necessary.
My guess is that the bus reset was performed because some other transaction
to the drive was delayed while it attempted internal retries to retrieve the
bad block.  The driver should have attempted to abort the transaction before
it threw the bus reset, but that recovery action must have failed.

One thing I do know about the FireBall ST is that the currently
shipping firmware can become unstable under certain rapid seek
patterns.  I doubt that a 1542 is capable of generating the load
required to see this problem.  We use these drives in the Pluto
Digital Space Recorder and had to work with Quantum for several
weeks until they believed that the bug was indeed their fault and
tracked the problem down to their servo code.  In our case, the
drives simply stopped functioning.  I don't think anyone here
checked the grown defect list to see if the drives were remapping
sectors.  Unfortunately, Pluto is not authorized to release the
firmware we've been given, but it was my understanding that this
fix would be made available in the next release of the ST firmware.
You might try contacting Quantum Tech support to see if you can
obtain new firmware in advance.


Responsible-Changed-From-To: freebsd-bugs->gibbs
Responsible-Changed-By: gibbs
Responsible-Changed-When: Sat Oct 4 08:38:51 PDT 1997
Responsible-Changed-Why: 
My driver.


