Date: Fri, 17 Feb 2012 00:58:27 -0000
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Jeremy Chadwick" <freebsd@jdc.parodius.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: ahci / ada hiding disk errors?
Message-ID: <64A97EF77EAB4FBBA20960732D59761D@multiplay.co.uk>
References: <1F7793659A864776BF0DD68DD3F68444@multiplay.co.uk> <20120216235252.GA59094@icarus.home.lan>

----- Original Message -----
From: "Jeremy Chadwick" <freebsd@jdc.parodius.com>
...
> The long test is still running, as I stated above.  Also, just as a data
> point: folks should remember to completely ignore the "remaining"
> percentage shown -- it is hardly ever accurate, especially on Western
> Digital drives.

Yep, was aware of this, just wanted to get it out there to see if anyone
had any ideas. Thanks for the feedback :)

The long test completed a few hours later and is still not reporting any
errors.

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6661         -
# 2  Short offline       Completed without error       00%      6657         -

Updated SMART attributes:-

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   082   063   044    Pre-fail  Always       -       171321356
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       17
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   074   060   030    Pre-fail  Always       -       29135041
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       6663
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       17
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   078   046   045    Old_age   Always       -       22 (Min/Max 21/23)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       12
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       17
194 Temperature_Celsius     0x0022   022   054   000    Old_age   Always       -       22 (0 14 0 0)
195 Hardware_ECC_Recovered  0x001a   118   099   000    Old_age   Always       -       171321356
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

>
> I would appreciate if you could provide actual timestamps associated
> with the above I/O errors from your logs.  One of the kernel logs,
> probably messages (or all.log if you enable it -- I HIGHLY recommend
> folks do!) -- should have actual datestamps too.  I'd like to see how
> often these occur, or if all at once.

The drive is currently totally inaccessible. Any I/O attempt to any area
results in an error:-

ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA="any lba"

> Otherwise I see nothing wrong with your drive.  As such, I'm inclined to
> believe what you're seeing is probably a bug in the ata(4) driver.

I don't think this is the case, as the drive had been working fine for
many months until a reboot today, after which nothing seems to be able to
access the data, even though the drive itself is still detected fine.

> Also, just for amusement value: so far in the past 7 days, this is the
> *TENTH* disk-related issue I have had to look at from people on the
> Internet (not just FreeBSD either).  I don't know what's going on, but
> you people are practically requiring me to make this a full-time job.
> Hell, maybe I should start doing "consulting" on these type of things,
> haha.

hehe ;-)

I think our next move will be to have the drive reseated and, if that
doesn't help, moved to another machine to see if that helps.

It seems very odd that SMART sees no sign of an issue yet nothing can be
accessed.

Any other ideas would be gratefully received.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.
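
For anyone reading the FAILURE line quoted in the message, here is a rough sketch of how the status and error values map to the names in angle brackets. It treats both values as hexadecimal 8-bit registers. The labels READY, DSC, ERROR and ABORTED come from the log line itself; every other label, and the helper names `decode` and `decode_failure`, are assumptions based on the classic ATA register layout, not taken from the driver source.

```python
# Bit labels, most significant bit (bit 7) first.
# READY, DSC, ERROR and ABORTED match the quoted log line; the remaining
# labels are assumed from the classic ATA status/error register layout.
STATUS_BITS = ["BUSY", "READY", "DEVICE_FAULT", "DSC",
               "DRQ", "CORRECTED", "INDEX", "ERROR"]
ERROR_BITS = ["ICRC", "UNCORRECTABLE", "MEDIA_CHANGED", "ID_NOT_FOUND",
              "MEDIA_CHANGE_REQUEST", "ABORTED", "NO_MEDIA",
              "ADDR_MARK_NOT_FOUND"]


def decode(value, labels):
    """Return the labels of the bits that are set in an 8-bit value."""
    return [name
            for bit, name in zip(range(7, -1, -1), labels)
            if value & (1 << bit)]


def decode_failure(status_hex, error_hex):
    """Decode the hex strings printed after status= and error=."""
    status = decode(int(status_hex, 16), STATUS_BITS)
    error = decode(int(error_hex, 16), ERROR_BITS)
    return status, error


if __name__ == "__main__":
    status, error = decode_failure("51", "4")
    print("status:", ",".join(status))
    print("error:", ",".join(error))
```

With the values from the log (status=51, error=4) this gives READY, DSC and ERROR for the status and ABORTED for the error, which agrees with what the driver printed: the drive answers and reports itself ready, but aborts the read command.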