From owner-freebsd-stable@FreeBSD.ORG Sat Aug 20 01:14:08 2011
Return-Path: 
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2EAD31065670
	for ; Sat, 20 Aug 2011 01:14:08 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta12.westchester.pa.mail.comcast.net
	(qmta12.westchester.pa.mail.comcast.net [76.96.59.227])
	by mx1.freebsd.org (Postfix) with ESMTP id CFDC28FC13
	for ; Sat, 20 Aug 2011 01:14:07 +0000 (UTC)
Received: from omta01.westchester.pa.mail.comcast.net ([76.96.62.11])
	by qmta12.westchester.pa.mail.comcast.net with comcast
	id NRCw1h0020EZKEL5CRE8ze; Sat, 20 Aug 2011 01:14:08 +0000
Received: from koitsu.dyndns.org ([67.180.84.87])
	by omta01.westchester.pa.mail.comcast.net with comcast
	id NRE61h0191t3BNj3MRE70T; Sat, 20 Aug 2011 01:14:08 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 4D774102C1A; Fri, 19 Aug 2011 18:14:05 -0700 (PDT)
Date: Fri, 19 Aug 2011 18:14:05 -0700
From: Jeremy Chadwick 
To: Kevin Oberman 
Message-ID: <20110820011405.GA20330@icarus.home.lan>
References: <1B4FC0D8-60E6-49DA-BC52-688052C4DA51@langille.org>
	<20110819235719.GA64220@night.db.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-stable@freebsd.org, Dan Langille 
Subject: Re: bad sector in gmirror HDD
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Sat, 20 Aug 2011 01:14:08 -0000

On Fri, Aug 19, 2011 at 05:51:02PM -0700, Kevin Oberman wrote:
> On Fri, Aug 19, 2011 at 4:57 PM, Diane Bruce wrote:
> > On Fri, Aug 19, 2011 at 04:50:01PM -0400, Dan Langille wrote:
> >> System in question: FreeBSD 8.2-STABLE #3: Thu Mar  3 04:52:04 GMT 2011
> >>
> >> After a recent
> >> power failure, I'm seeing this in my logs:
> >>
> >> Aug 19 20:36:34 bast smartd[1575]: Device: /dev/ad2, 2 Currently unreadable (pending) sectors
> >
> > Personally, I'd replace that drive now.
> >
> >> Searching on that error message, I was led to believe that identifying
> >> the bad sector and running dd to read it would cause the HDD to
> >> reallocate that bad block.
> >
> > No, as otherwise mentioned (Hi Jeremy!) you need to read and write the
> > block. This could buy you a few more days or a few more weeks.
> > Personally, I would not wait. Your call.
>
> While I largely agree, it depends on several factors as to whether I'd
> replace the drive.
>
> First, what does SMART show other than these errors? If the reported
> statistics look generally good, and considering that you have a mirror
> with one "good" copy of the blocks in question, the impact is zero
> unless the other drive fails. That is why the blocks need to be
> re-written, so that they will be relocated on the drive.
>
> Second, how critical is the data? The mirror gives good integrity, but
> you also need good backups. If the data MUST be on-line with high
> reliability, buy a replacement drive. You need to look at cost-benefit
> (or really the cost of replacement vs. the cost of failure).
>
> It's worth mentioning that all drives have bad blocks. Most are hard
> bad blocks and are re-mapped before the drive is shipped, but marginal
> bad blocks can and do slip through to customers, and it is entirely
> possible that the drive is just fine for the most part and replacing
> it would really be a waste of money.
>
> Only you can make the call, but if further bad blocks show up in the
> near term, I'll go along with recommending replacement.

I can expand a bit on this.

With ATA/SATA and SCSI disks, there's a factory-default list of LBAs
which are bad (referred to as the "physical defect list"). Everyone by
now is familiar with this.
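As an aside, since Diane brought it up above: the read-then-rewrite procedure looks roughly like the sketch below. The device name, LBA, and sector size here are hypothetical placeholders -- substitute the values from your own smartd/kernel logs -- and the destructive dd invocation is deliberately left commented out.

```shell
# Hypothetical values -- substitute the device and LBA from your own logs.
DEV=/dev/ad2        # the disk reporting pending sectors (placeholder)
LBA=123456789       # hypothetical bad LBA; smartd/kernel messages report the real one
SECSZ=512           # sector size in bytes; `diskinfo -v $DEV` shows it on FreeBSD

# 1. Confirm the sector really is unreadable (non-destructive read):
#      dd if=$DEV of=/dev/null bs=$SECSZ skip=$LBA count=1
#
# 2. Overwrite that one sector so the drive can remap it.  This DESTROYS
#    the sector's contents, so resync from the good gmirror member after:
#      dd if=/dev/zero of=$DEV bs=$SECSZ seek=$LBA count=1

# Equivalent byte offset of the sector, if you prefer addressing it directly:
OFFSET=$((LBA * SECSZ))
echo "$OFFSET"
```

Note that dd's skip= operand skips input blocks and seek= skips output blocks, which is why the read uses skip= and the write uses seek=.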
With SCSI disks there are "grown defects": a drive-managed AND
user-managed list of LBAs which are considered bad. Whether these LBAs
were correctable (remapped) or not is tracked by SMART on SCSI. I can
provide many examples of this if people want to see what it looks like
(we have quite a collection of Fujitsu disks at my workplace; they're one
of a few vendors I more or less boycott).

With SCSI, you can clear the grown defect list with ease. Some drives
support clearing the physical defect list too, but doing that requires a
*true* low-level format to be done afterward. If you issue a SCSI FORMAT
command, any grown defects (as the drive encounters them) will be
"merged" into the physical defect list, and when the FORMAT is done, the
drive will report 0 grown defects. Again, I can confirm this exact
behaviour with our Fujitsu disks at my workplace; it's easy to get a list
of the physical and grown defects with SCSI.

With ATA/SATA disks it's a different story: it seems to vary from vendor
to vendor and model to model. The established theory is that the drive
has a list of spare LBAs for remappings, which is managed entirely by the
drive itself -- and not reported back to the user via SMART or any other
means. This happens entirely without user intervention, and (on
repetitive errors) might show up as the drive stalling on some I/O or
other oddities. These situations are not reported back to the OS either
-- it's 100% transparent to the user.

When an ATA/SATA disk begins reporting errors on certain LBA accesses,
whether via SMART or to the OS (e.g. an I/O error), the theory is that
the spare LBA list used by the drive internally has been exhausted, and
it will begin using a different spare list (or an extension of the
existing spares; I'm not sure).

What Diane's getting at (Hi Diane!)
is that since the drive has already reached the point of reporting errors
back to the OS and SMART, it has experienced problems (which it worked
around) prior to this point in time. Hence her recommendation to replace
the drive.

What I still have a bit of trouble stomaching these days is whether or
not the above theories still apply *today* in practice on SATA disks.
Part of me is inclined to believe that **any** errors are reported to
SMART and the OS, that remapping is reported via SMART, etc.; e.g.
there's no more "transparent" anything. The problem is that I don't have
a good way to confirm or deny this. Oh, what I'd give for good
engineering contacts within Western Digital and Seagate...

These days, I replace drives depending upon their age (Power_On_Hours)
combined with how many errors are seen and what kind of errors. For
example, if I have a drive that's been in operation for 20,000 hours and
it now has 2 bad LBAs, I can accept that. If I have a drive that's been
in operation for 48 hours and it has 30 errors, that drive is getting
RMA'd.

When I get new or RMA'd/refurbished drives, I test them before putting
them to use. I do a read-only surface scan using SMART ("smartctl -t
select,0-max /dev/XXX") and let that finish. Assuming no errors are
shown in the selective scan log, I then proceed with a full-disk zero
("dd if=/dev/zero of=/dev/XXX bs=64k"). When finished, I check SMART for
any errors. If there are any, I RMA the drive -- or if it's been RMA'd
already, I get angry at the vendor. :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, US  |
| Making life hard for others since 1977.              PGP 4BD6C0CB  |
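P.S. For anyone wanting the burn-in procedure above in one place, here's a rough sketch. The device name is a placeholder, and since steps 2 onward are destructive, the function only *prints* the commands rather than running them -- drop the function wrapper and run each step by hand, waiting for the selective scan to finish before zeroing.

```shell
# Dry-run sketch of my new/RMA'd-drive burn-in.  /dev/ada9 is a
# placeholder; the function prints the steps instead of executing them,
# because the full-disk zero destroys everything on the drive.
burnin_plan() {
    dev=$1
    echo "smartctl -t select,0-max $dev"   # 1. read-only selective surface scan
    echo "smartctl -l selective $dev"      # 2. check the selective self-test log
    echo "dd if=/dev/zero of=$dev bs=64k"  # 3. full-disk zero -- DESTRUCTIVE
    echo "smartctl -A $dev"                # 4. re-check SMART attributes
    echo "smartctl -l error $dev"          # 5. re-check the SMART error log
}

burnin_plan /dev/ada9
```

Any new reallocated/pending sectors or error-log entries after step 5 mean the drive goes back to the vendor.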