From owner-freebsd-fs@FreeBSD.ORG  Fri Jan 20 18:18:30 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E0AFA106566C
	for <freebsd-fs@freebsd.org>; Fri, 20 Jan 2012 18:18:30 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta11.emeryville.ca.mail.comcast.net
	(qmta11.emeryville.ca.mail.comcast.net [76.96.27.211])
	by mx1.freebsd.org (Postfix) with ESMTP id BFA828FC12
	for <freebsd-fs@freebsd.org>; Fri, 20 Jan 2012 18:18:30 +0000 (UTC)
Received: from omta19.emeryville.ca.mail.comcast.net ([76.96.30.76])
	by qmta11.emeryville.ca.mail.comcast.net with comcast
	id PrJE1i0011eYJf8ABuJWGS; Fri, 20 Jan 2012 18:18:30 +0000
Received: from koitsu.dyndns.org ([67.180.84.87])
	by omta19.emeryville.ca.mail.comcast.net with comcast
	id PuJV1i00C1t3BNj01uJVFw; Fri, 20 Jan 2012 18:18:30 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 073B1102C19; Fri, 20 Jan 2012 10:18:29 -0800 (PST)
Date: Fri, 20 Jan 2012 10:18:29 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Dennis Glatting <dg@pki2.com>
Message-ID: <20120120181828.GA1049@icarus.home.lan>
References: <alpine.BSF.2.00.1201191604510.19710@kozubik.com>
	<4F192ADA.5020903@brockmann-consult.de>
	<1327069331.29444.4.camel@btw.pki2.com>
	<20120120153129.GA97746@icarus.home.lan>
	<1327077094.29408.11.camel@btw.pki2.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1327077094.29408.11.camel@btw.pki2.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: sanity check:  is 9211-8i, on 8.3, with IT firmware still "the
 one"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 20 Jan 2012 18:18:31 -0000

On Fri, Jan 20, 2012 at 08:31:34AM -0800, Dennis Glatting wrote:
> On Fri, 2012-01-20 at 07:31 -0800, Jeremy Chadwick wrote:
> 
> > On Fri, Jan 20, 2012 at 06:22:11AM -0800, Dennis Glatting wrote:
> > > I am having a problem with Seagate ST1000DL002 disks but I haven't yet
> > > determined weather it is the disks themselves (they -- two of them, new
> > > -- fail under a MB controller too.
> > 
> > Assuming the disks are seen directly on the bus (e.g. show up as daX,
> > adaX, or whatever), please install ports/sysutils/smartmontools (make
> > sure you're using version 5.42 or newer) and please provide output from
> > the following command: "smartctl -a /dev/XXX" where XXX is the device
> > name of the ST1000DL002 disk(s).  Please be sure to state which device
> > name is associated with which smartctl output.  You can delete or
> > remove the disk serial numbers from the output (for privacy) if you
> > wish.  I'll be happy to review the data and tell you whether or not the
> > disks themselves are showing problems or if the issue is elsewhere.
> 
> That is the motivation I needed to reboot that system, which was 50%
> through a task. That said, as remains the case today, for the last 20
> years I haven't been able to find that "Any Key" on reboot. :)
> 
> Regardless...

First off, let's start with the full picture.  Readers need to know
exactly what is going on within your controller setup, what disks are
connected to what, etc.  Taken from your full dmesg below, and turned
into something (mostly) easy to read:

Controller mps0
  --> LSI SAS2008
  --> IRQ 19 on pci1
  --> Firmware 12.00.00.00
  --> Disks attached:
      --> da0  --> WDC WD25EZRS, SATA300
      --> da1  --> WDC WD25EZRS, SATA300
      --> da2  --> WDC WD25EZRS, SATA300
      --> da3  --> WDC WD25EZRS, SATA300
      --> da4  --> WDC WD25EZRS, SATA300
      --> da5  --> WDC WD25EZRS, SATA300
      --> da6  --> WDC WD25EZRS, SATA300
      --> da7  --> WDC WD25EZRS, SATA300

Controller mps1
  --> LSI SAS2008
  --> IRQ 19 on pci5
  --> Firmware 12.00.00.00
  --> Disks attached:
      --> None

Controller mps2
  --> LSI SAS2008
  --> IRQ 16 on pci6
  --> Firmware 12.00.00.00
  --> Disks attached:
      --> da8  --> WDC WD25EZRS, SATA300
      --> da9  --> WDC WD25EZRS, SATA300
      --> da10 --> WDC WD25EZRS, SATA300
      --> da11 --> WDC WD25EZRS, SATA300
      --> da12 --> ST1000DL002, SATA300

Controller ahci0
  --> ATI IXP700 AHCI (4-port)
  --> IRQ 19 on pci0
  --> Disks attached:
      --> ahcich0 --> ada0 --> Corsair Force 3 SSD, SATA600
      --> ahcich1 --> ada1 --> OCZ-AGILITY2 SSD, SATA300
      --> ahcich2 --> ada2 --> ST31000333AS, SATA300

Controller ata0
  --> ATI IXP700/800 ATA133 (2-port/4-device, PATA)
  --> IRQ <???> on pci0
  --> I would assume this would be on IRQ 14 or 15, sigh...
  --> Disks attached:
      --> None

Now that we have a full picture, let's continue.

> An attempt to write to it:
> 
> bd3# dd if=/dev/zero of=/dev/da12
> dd: /dev/da12: Input/output error
> 1+0 records in
> 0+0 records out
> 0 bytes transferred in 0.378153 secs (0 bytes/sec)

The dd command you executed to write zeros to the disk, 512 bytes at a
time, starting at LBA 0, failed when writing the first 512 bytes.  So,
from my perspective, writing to LBA 0 is failing.

You should also keep in mind that this dd command to zero the disk (if
it were to work) would take a very long time to complete.  If you used a
larger block size (bs=64k or maybe larger), it would be a lot faster.
Just a tip.  Starting with bs=512 (the default) is fine, though in this
case bs=4096 would probably be better (see below).
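
To put rough numbers on the block size tip, here is a minimal Python
sketch.  It assumes the 1,000,204,886,016-byte capacity that smartctl
reports for this drive, and it only counts how many writes dd would have
to issue at each block size.  It says nothing about real throughput,
which depends on the drive and the controller.

```python
# Capacity as reported by smartctl for the ST1000DL002, in bytes.
CAPACITY_BYTES = 1_000_204_886_016


def writes_needed(block_size: int) -> int:
    """Number of writes needed to cover the whole disk (ceiling division)."""
    return -(-CAPACITY_BYTES // block_size)


for bs in (512, 4096, 64 * 1024):
    print(f"bs={bs:>6}: {writes_needed(bs):>13,} writes")
```

The point is only the ratio: bs=64k needs 128 times fewer writes than
the 512-byte default to cover the same disk.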

> The disk is presently connected  to this device (LSI 9211-8i) but I have
> also had it connected to the devices on the MB and I think to a
> SuperMicro board. I have also tried a different LSI board.

Thanks for sharing this -- this is important information, but let's not
start moving the drive around any more, okay?  There's no point.  The
information you've given is enough, and I'll explain it in detail.

> {snipping for brevity}
> 
> bd3# smartctl -a /dev/da12
> smartctl 5.42 2011-10-20 r3458 [FreeBSD 9.0-STABLE amd64] (local build)
> Copyright (C) 2002-11 by Bruce Allen,
> http://smartmontools.sourceforge.net
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Seagate Barracuda Green (Adv. Format)
> Device Model:     ST1000DL002-9TT153
> Serial Number:    W1V06SLR
> LU WWN Device Id: 5 000c50 037e11be9
> Firmware Version: CC32
> User Capacity:    1,000,204,886,016 bytes [1.00 TB]
> Sector Size:      512 bytes logical/physical
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   8
> ATA Standard is:  ATA-8-ACS revision 4
> Local Time is:    Fri Jan 20 08:22:34 2012 PST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> {snipping for brevity}
> 
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   108   099   006    Pre-fail  Always -       241488
>   3 Spin_Up_Time            0x0003   087   070   000    Pre-fail  Always -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always -       28
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always -       0
>   7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always -       136324
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always -       576
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always -       29
> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always -       0
> 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always -       0
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always -       0
> 188 Command_Timeout         0x0032   100   100   000    Old_age   Always -       0
> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always -       0
> 190 Airflow_Temperature_Cel 0x0022   073   062   045    Old_age   Always -       27 (Min/Max 21/27)
> 191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always -       0
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always -       23
> 193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always -       29
> 194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always -       27 (0 21 0 0 0)
> 195 Hardware_ECC_Recovered  0x001a   027   008   000    Old_age   Always -       241488
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline -      0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always -       0
> 240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline -      265544943010369
> 241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline -      3746932548
> 242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline -      3212957483
> 
> SMART Error Log Version: 1
> No Errors Logged
>
> {snipping more}

Your SMART attributes here appear perfectly fine.  There is no
indication of bad LBAs (sectors) on the drive, or even "suspect" LBAs on
the drive.  If LBA 0, for example, was actually bad (meaning the sector
itself), that would show up in the SMART error log (most likely), and if
not there, at bare minimum as some form of incremented RAW_VALUE field
in one of many attributes (either 5, 197, or 198; possibly 187, I forget).
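
As an illustration of that check, here is a small Python sketch that
pulls the RAW_VALUE of attributes 5, 187, 197 and 198 out of
"smartctl -a" style text.  The sample rows are copied from the output
quoted above; the column layout is assumed to match that output, and the
function name is made up for this example.

```python
# Attributes that normally track bad or suspect sectors.
SECTOR_HEALTH_IDS = {5, 187, 197, 198}

# A few rows copied from the smartctl -a output quoted above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always -       0
190 Airflow_Temperature_Cel 0x0022   073   062   045    Old_age   Always -       27 (Min/Max 21/27)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline -      0
"""


def sector_health(text):
    """Return {attribute_id: raw_value} for the sector-related attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header or unrelated line
        attr_id = int(fields[0])
        if attr_id in SECTOR_HEALTH_IDS:
            # RAW_VALUE may carry trailing annotations; keep the leading number.
            found[attr_id] = int(fields[9].split()[0])
    return found


raw = sector_health(SAMPLE)
suspect = {attr_id: value for attr_id, value in raw.items() if value != 0}
print("suspect attributes:", suspect or "none")
```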

SMART attributes 1, 7, and 195 on Seagate drives are always "crazy";
that is to say, they are not incremental counters, they are
vendor-encoded.  smartmontools does not know how to decode some of these
attributes (on SOME Seagate drives it does, on others it doesn't).  I
state this because people read SMART attributes wrong ~70% of the time;
they see non-zero numbers and go "oh my god, it's broken!"  No it isn't.
SMART attribute values/decoding are not part of the ATA specification
(even working draft), so it's all proprietary more or less.
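
One way to avoid misreading the raw numbers is to look at the normalized
VALUE and THRESH columns instead.  The sketch below uses the three
"crazy" Seagate rows from the output above.  The rule it applies (an
attribute counts as failing only when VALUE is at or below a non-zero
THRESH) is an assumption about how the thresholds are meant to be read,
not something taken from Seagate documentation.

```python
# (id, name, VALUE, WORST, THRESH, RAW_VALUE) copied from the smartctl output.
ROWS = [
    (1, "Raw_Read_Error_Rate", 108, 99, 6, 241488),
    (7, "Seek_Error_Rate", 100, 253, 30, 136324),
    (195, "Hardware_ECC_Recovered", 27, 8, 0, 241488),
]


def is_failing(value, thresh):
    # Assumption: THRESH 0 means "no threshold"; otherwise fail when VALUE <= THRESH.
    return thresh != 0 and value <= thresh


verdicts = {
    attr_id: is_failing(value, thresh)
    for attr_id, _name, value, _worst, thresh, _raw in ROWS
}
for attr_id, _name, value, _worst, thresh, raw_value in ROWS:
    state = "FAILING" if verdicts[attr_id] else "ok"
    print(f"{attr_id:>3}: VALUE={value} THRESH={thresh} RAW={raw_value} -> {state}")
```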

I also want to assume attribute 240 is vendor-encoded as well, probably
as multiple data sets stored within the full 6-byte attribute field;
again, smartmontools doesn't know how to decode this.  Again, I wouldn't
worry about this, even though the number is huge.  :-)
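
To show what "multiple data sets in a 6-byte field" could look like,
here is a short Python sketch that splits the attribute 240 RAW_VALUE
into a low 32-bit part and a high 16-bit part.  That particular split is
purely an assumption for illustration; the real layout is whatever
Seagate's firmware uses.

```python
RAW_240 = 265544943010369  # Head_Flying_Hours RAW_VALUE from the output above

# The raw field is 6 bytes (48 bits) wide.
raw_bytes = RAW_240.to_bytes(6, "big")

# Hypothetical split: low 32 bits as one counter, high 16 bits as another.
low32 = RAW_240 & 0xFFFFFFFF
high16 = RAW_240 >> 32

print("bytes :", raw_bytes.hex())
print("low32 :", low32)
print("high16:", high16)
```

With this split the low 32 bits come out as 577, right next to the
Power_On_Hours raw value of 576, which at least fits the guess that the
huge number is several fields packed together.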

SMART attribute 184 keeps track of errors occurring between the drive
controller (on the PCB) and the drive cache; there are no cache errors.
That's good, and I'm glad to see vendors implementing this.

SMART attribute 188 indicates the drive itself has not counted any
command timeouts (these would be ATA commands sent from the OS through
the SATA/SAS controller to the drive controller, which timed out at the
phase when the drive attempted to read/write data from a sector).

SMART attribute 199 indicates there are no cabling problems or "physical
issues between the disk and the SATA/SAS controller" (bad connectors,
dust in the connectors, shoddy hot-swap plane, bad port, etc.).

SMART attribute 183 is something I haven't seen before (I'm more
familiar with Western Digital disks), but it also looks fine.

So again: your drive looks perfectly healthy per SMART stats.  But
there's something amusing about this situation that a lot of people
overlook...

> {snipping dmesg for brevity, but here's the URL for readers so they
> can see it themselves:
> http://lists.freebsd.org/pipermail/freebsd-fs/2012-January/013481.html
> }
>
> {simplify the SCSI errors shown}
>
> (da12:mps2:0:5:0): READ(6). CDB: 8 0 0 1 1 0
> (da12:mps2:0:5:0): CAM status: SCSI Status Error
> (da12:mps2:0:5:0): SCSI status: Check Condition
> (da12:mps2:0:5:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
> (da12:mps2:0:5:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
> (da12:mps2:0:5:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
> (da12:mps2:0:5:0): READ(10). CDB: 28 0 74 70 6d af 0 0 1 0
> (da12:mps2:0:5:0): CAM status: SCSI Status Error
> (da12:mps2:0:5:0): SCSI status: Check Condition
> (da12:mps2:0:5:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
> (da12:mps2:0:5:0): WRITE(6). CDB: a 0 0 0 1 0
> (da12:mps2:0:5:0): CAM status: SCSI Status Error
> (da12:mps2:0:5:0): SCSI status: Check Condition
> (da12:mps2:0:5:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)

Based on this, we know the following:

- The da12 disk is doing something weird when it comes to reads AND
  writes.
- The da12 disk is not timing out; it receives an immediate error on
  reads and writes (coming back from the controller; whether or not the
  ATA command block makes it to the disk is unknown, but I have to
  assume it does).
- The da12 disk, at one time, was working/usable as indicated by some
  SMART attributes.
- The da12 disk is the only ST1000DL002 disk in the system.
- The da12 disk is on the same controller as 4 other disks.
- The da8 through da11 disks (WD25EZRS) on the mps2 controller are
  performing fine with no issues (I have to assume this).
- The ST1000DL002 disk is an Advanced Format disk (4096-byte sectors).
- All the WD25EZRS disks are Advanced Format disks (4096-byte sectors).
- The ST1000DL002 disk behaves badly when used on the on-board AHCI
  controller as well as a completely different motherboard (presumably).
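
For readers who want to see which LBAs the failing commands were aimed
at, the CDBs in the quoted dmesg lines can be decoded by hand.  Below is
a minimal Python sketch for the 6-byte and 10-byte READ/WRITE forms
only, assuming the CDB bytes are printed in hex as they are above.  The
function name is made up for this example, and the last-LBA figure
assumes 512-byte logical sectors as reported by smartctl.

```python
OPCODES = {0x08: "READ(6)", 0x0A: "WRITE(6)", 0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb(cdb_text):
    """Decode a READ/WRITE (6) or (10) CDB given as space-separated hex bytes."""
    b = [int(x, 16) for x in cdb_text.split()]
    op = b[0]
    if op in (0x08, 0x0A):
        # 6-byte form: 21-bit LBA in bytes 1-3, transfer length in byte 4.
        lba = ((b[1] & 0x1F) << 16) | (b[2] << 8) | b[3]
        length = b[4] or 256  # a length of 0 means 256 blocks
    elif op in (0x28, 0x2A):
        # 10-byte form: 32-bit LBA in bytes 2-5, transfer length in bytes 7-8.
        lba = int.from_bytes(bytes(b[2:6]), "big")
        length = (b[7] << 8) | b[8]
    else:
        raise ValueError(f"unsupported opcode 0x{op:02x}")
    return OPCODES[op], lba, length


# Last addressable LBA, assuming 512-byte logical sectors.
LAST_LBA = 1_000_204_886_016 // 512 - 1

for cdb in ("8 0 0 1 1 0", "28 0 74 70 6d af 0 0 1 0", "a 0 0 0 1 0"):
    name, lba, length = decode_cdb(cdb)
    note = " (last LBA)" if lba == LAST_LBA else ""
    print(f"{name:<9} lba={lba}{note} blocks={length}")
```

Decoded this way, the WRITE(6) is a one-block write to LBA 0 (matching
the dd failure), and the READ(10) is a one-block read of the very last
LBA on the disk.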

Here's the fun part:

ATA commands being submitted from the OS to the disk (specifically the
controller on the disk itself) are working fine.  SMART attributes are
obtained via an ATA command that, internally on mechanical drives,
fetches data from the HPA (Host Protected Area) region of the drive (see
Wikipedia if you don't know about this), and returns that data.  AFAIK
this data is not cached in any way, it's almost always read straight
from the HPA.

So this means we know I/O communication between the OS and controller,
and the controller and the disk, works fine.  And we also know, at least
with regard to the HPA region, that the heads can read data from the HPA
region successfully.  Great.

Could this be a controller problem (e.g. a firmware bug that affects 
compatibility with ST1000DL002 drives)?  I'm about 95% certain the
answer is no.  The reason is that the ST1000DL002 drive behaved the same
when put on other controllers.

What all this means is that the drive, in effect, refuses to read data
from non-HPA regions of the disk -- that means LBA 0 to <last LBA>.  Why
or how could this happen?  Unknown, because there's a *ton* of
possibilities -- way more than I care to speculate.  :-)

Have I seen this problem before?  Yes -- many times, but only once with
a SATA drive:

- I see this on rare occasion with Fujitsu SCSI disks at my workplace,
  where the drives flat out refuse to do I/O any longer.  However, these
  return a vendor-specific ASC + ASCQ that indicate the drive is in a
  "locked" or "frozen" state, requiring Fujitsu to investigate.  I've
  seen it happen a good 10, maybe 20 times over the past few years on
  drives manufactured from 2001 to 2007.  Thankfully Fujitsu provides
  full docs on their SCSI drives so I was able to look up the ASC/ASCQ
  and figure out it was an internal drive failure.  We disposed of the
  disks properly/securely.

- In the SATA case, the end-user's drive behaved the same as yours.  I
  do not remember what brand (it really doesn't matter though).  In
  their case, however, the HPA region was corrupt; the drive spit out
  weird errors during SMART attribute fetch, and those attributes which
  it did fetch were *completely* garbled.  My guess was a bad HPA region
  of the drive, combined with either a firmware bug or something
  mechanical or head problems.  The end-user RMA'd the drive and the
  replacement worked fine.

My advice at this point (#1 is optional):

1. If you're curious and just interested in learning: put the
ST1000DL002 disk on a system where it's the only disk, and hooked
directly to the motherboard (and not in AHCI mode), and boot SeaTools
from a CD or USB stick.

I'm willing to bet you get back an error code on the quick/short test
(which does more than just a SMART short test).  If that does pass, try
doing a long test (which reads all the LBAs on the drive).  I'll be
very, VERY surprised if that passes.

2. File an RMA with Seagate.  The simple version is that all LBA I/O
(standard read/write) is being rejected by the drive for unknown
reasons.

Good luck, and hope this sheds some light on the "fun" (or not so fun)
world of hard disk troubleshooting.  And don't ask me to troubleshoot an
SSD.  ;-)

-- 
| Jeremy Chadwick                                 jdc@parodius.com |
| Parodius Networking                     http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, US |
| Making life hard for others since 1977.             PGP 4BD6C0CB |