From owner-freebsd-scsi  Thu Jun 17 19:37:21 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from panzer.plutotech.com (panzer.plutotech.com [206.168.67.125])
	by hub.freebsd.org (Postfix) with ESMTP id ED192155D7
	for <freebsd-scsi@FreeBSD.ORG>; Thu, 17 Jun 1999 19:37:14 -0700 (PDT)
	(envelope-from ken@panzer.plutotech.com)
Received: (from ken@localhost)
          by panzer.plutotech.com (8.9.3/8.8.5) id UAA92294;
          Thu, 17 Jun 1999 20:37:12 -0600 (MDT)
From: "Kenneth D. Merry" <ken@plutotech.com>
Message-Id: <199906180237.UAA92294@panzer.plutotech.com>
Subject: Re: suspected bad block
In-Reply-To: <199906180209.VAA43562@nospam.hiwaay.net> from David Kelly at "Jun 17, 1999 09:09:31 pm"
To: dkelly@HiWAAY.net (David Kelly)
Date: Thu, 17 Jun 1999 20:37:11 -0600 (MDT)
Cc: freebsd-scsi@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

David Kelly wrote...
> This is the first time something like this has popped up:
> Jun 16 06:02:16 nospam /kernel: (da0:ncr0:0:0:0): READ(10). CDB: 28 0 1 9 b6 2 0 0 2 0 
> Jun 16 06:02:16 nospam /kernel: (da0:ncr0:0:0:0): MEDIUM ERROR info:109b603 asc:11,b
> Jun 16 06:02:16 nospam /kernel: (da0:ncr0:0:0:0): Unrecovered read error - recommend reassignment sks:80,2f
> 
> Think "make release" was running at the time the above happened.
> 
> "sks:80,2f" doesn't look like a block number to me. What is it? Is this 
> junk being reported by my HD?

The sks (sense key specific) value for errors with a MEDIUM ERROR sense
key refers to the drive's actual retry count for the command.
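
As a rough illustration of how to read "sks:80,2f": the sketch below assumes
the kernel prints the first sense-key-specific byte (whose top bit, SKSV,
marks the field as valid) followed by the 16-bit retry count, as in the
SCSI-2 fixed sense format.  The function name decode_sks is made up for
this example, and the print layout is an assumption.

```python
def decode_sks(first_byte, retry_field):
    """Interpret the two values printed as sks:<first_byte>,<retry_field>.

    Assumption: first_byte is the first sense-key-specific byte (bit 7 is
    SKSV, the "valid" bit) and retry_field is the 16-bit actual retry count.
    """
    valid = bool(first_byte & 0x80)
    # The retry count only means something when SKSV is set.
    return valid, (retry_field if valid else None)


valid, retries = decode_sks(0x80, 0x2f)
print(valid, retries)
```

Under those assumptions, the values from the log decode to a valid field
with a retry count of 0x2f (47 decimal).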

The info field above shows the block.  The block number with the problem is
0x109b603.  From the READ CDB above, we know that it was a 2-block read
request starting at block 0x109b602.  So, the block number makes sense.
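
The same check can be sketched in a few lines of Python.  This assumes the
standard 10-byte READ/WRITE CDB layout (opcode in byte 0, big-endian block
address in bytes 2-5, big-endian transfer length in bytes 7-8); parse_rw10
is a hypothetical helper name, not part of any FreeBSD tool.

```python
def parse_rw10(cdb_text):
    """Parse a 10-byte READ(10)/WRITE(10) CDB given as hex bytes in a string."""
    cdb = [int(tok, 16) for tok in cdb_text.split()]
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")    # starting block address
    count = int.from_bytes(bytes(cdb[7:9]), "big")  # number of blocks
    return cdb[0], lba, count


opcode, lba, count = parse_rw10("28 0 1 9 b6 2 0 0 2 0")
bad_block = 0x109b603  # the "info" value from the MEDIUM ERROR line

print(hex(opcode), hex(lba), count)
# Is the reported block inside the range the command asked for?
print(lba <= bad_block < lba + count)
```

The second print shows whether the reported block falls inside the range
covered by the request.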

> This is what dmesg has to say about the drive:
> da0 at ncr0 bus 0 target 0 lun 0
> da0: <IBM OEM DCHS09W 2222> Fixed Direct Access SCSI-2 device 
> da0: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da0: 8689MB (17796077 512 byte sectors: 255H 63S/T 1107C)
> 
> Am running "dd if=/dev/rda0c ibs=64k of=/dev/null" on the device right 
> now to see if I can prompt the problem again. Or if the drive has 
> already remapped it.

You'll probably see the error again.  The drive can generally only remap a
block if it has good data to put in place of the data it can't retrieve. 
Sometimes it can use ECC data to reconstruct the block, but if it can't, it
probably won't be able to remap it.  (There are rules the drive follows,
which can be adjusted in mode page 1, the read-write error recovery page.
Read the SCSI spec for details.)
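
For reference, the flags in question live in byte 2 of mode page 1 (the
read-write error recovery page in SCSI-2).  The sketch below decodes that
byte into its named bits; the sample value 0xc0 is hypothetical and was not
read from the drive discussed here, and recovery_flags is a made-up helper
name.

```python
# Bit names for byte 2 of mode page 1, from bit 7 down to bit 0.
FLAG_NAMES = ["AWRE", "ARRE", "TB", "RC", "EER", "PER", "DTE", "DCR"]


def recovery_flags(byte2):
    """Map each error recovery flag name to True/False for the given byte."""
    return {name: bool(byte2 & (0x80 >> i)) for i, name in enumerate(FLAG_NAMES)}


flags = recovery_flags(0xc0)  # hypothetical value: AWRE and ARRE both set
print(flags["AWRE"], flags["ARRE"])
```

AWRE (automatic write reallocation) and ARRE (automatic read reallocation)
are the two bits that govern whether the drive remaps blocks on its own.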

> Have mentioned in the past, camcontrol isn't quite 
> happy with the above HD and Asus SC875 SCSI card:
> 
> # date;  camcontrol defects -f block -G ; sleep 10; tail -5 /var/log/messages ; date
> Thu Jun 17 21:04:39 CDT 1999
> error reading defect list: Input/output error
> Jun 17 21:03:47 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84800.
> Jun 17 21:04:39 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Jun 17 21:04:39 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
> Jun 17 21:04:39 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Jun 17 21:04:40 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
> Thu Jun 17 21:04:49 CDT 1999

Yeah, I think I remember your saying something about it.  I will say that
IBM drives typically don't like the block defect format.  The physical
sector format works for most drives.  Quantum disks often work with the
block format, but Seagate and IBM disks usually don't.

It would help if I could figure out what the error above from the NCR
driver means.  Maybe if Stefan or Gerard are reading this, they might be
able to help.

If I get time this weekend (and if I can remember) I may hook up a disk to
an NCR controller and see if I can reproduce this.

Assuming you've got AWRE turned on, you can force the drive to remap the
block, probably like this:

camcontrol cmd -n da -u 0 -v -c "2a 0 1 09 b6 03 0 0 1 0" -o 512 - < /dev/null

Use the above command at your own risk.  It should write one block worth of
nulls to the bad sector in question, but I might have gotten it wrong.
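
To double-check the CDB bytes before running anything, a WRITE(10) CDB for
a given block can be built like this.  It is only a sketch under the
standard 10-byte CDB layout; write10_cdb is a hypothetical helper, and it
only formats the bytes, it does not talk to the drive.

```python
def write10_cdb(lba, count=1):
    """Return a WRITE(10) CDB for `count` blocks at `lba` as a hex string."""
    cdb = (
        bytes([0x2a, 0])           # opcode WRITE(10), flags byte left at 0
        + lba.to_bytes(4, "big")   # logical block address
        + bytes([0])               # reserved
        + count.to_bytes(2, "big") # transfer length in blocks
        + bytes([0])               # control byte
    )
    return " ".join(format(b, "x") for b in cdb)


print(write10_cdb(0x109b603))
```

The printed bytes can be compared against the string passed to -c above;
leading zeros aside, they should match.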

Ken
-- 
Kenneth Merry
ken@plutotech.com

