Date: Fri, 25 Jan 2008 12:59:38 -0800
From: Jeremy Chadwick
To: Chuck Swiger
Cc: Joe Peterson, freebsd-stable@freebsd.org
Subject: Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1
Message-ID: <20080125205938.GA46170@eos.sc1.parodius.com>
In-Reply-To: <3803988D-8D18-4E89-92EA-19BF62FD2395@mac.com>

On Fri, Jan 25, 2008 at 12:46:08PM -0800, Chuck Swiger wrote:
> On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote:
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x000f   114   071   006    Pre-fail  Always       -       82422948
> [ ... ]
>>
>>   7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       286126605
> [ ... ]
>> 195 Hardware_ECC_Recovered  0x001a   063   046   000    Old_age   Always       -       166181300
>
> These numbers are quite worrisome -- they should be zero or nearly so in a
> healthy drive.

On some drives, yes, but not all drives. His is a Seagate drive --
Seagate uses some of the bits in the "raw data" section for some sort of
internal use by the drive firmware. So although the raw values may appear
very high, the drive appears to function normally, and the actual
"adjusted SMART value" (the field under VALUE) doesn't fluctuate.

I have Seagate drives all over the place which exhibit stats just like
the above. I've included some for comparison below; each listed is on a
different system.

Look at attribute 190 (Temperature_Celsius) for an example; I don't think
any drive can reach 773849124C. Or, well, I sure hope not. :-) I believe
in the case of attrib. 190, that's why they present a human-readable
value in attribute 194.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

==SNIP==

ad6: 476940MB at ata3-master SATA300

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   112   094   006    Pre-fail  Always       -       221374987
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       200009014
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2967
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       9
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Temperature_Celsius     0x0022   064   050   045    Old_age   Always       -       773849124
194 Temperature_Celsius     0x0022   036   050   000    Old_age   Always       -       36 (Lifetime Min/Max 0/29)
195 Hardware_ECC_Recovered  0x001a   066   059   000    Old_age   Always       -       36458075
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       18
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       18
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

ad4: 114473MB at ata2-master SATA150

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   063   052   006    Pre-fail  Always       -       57703728
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       169005025
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3536
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       24
194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always       -       27 (Lifetime Min/Max 0/15)
195 Hardware_ECC_Recovered  0x001a   063   052   000    Old_age   Always       -       57703728
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

ad4: 238475MB at ata2-master SATA300
ad6: 238475MB at ata3-master SATA300

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   100   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   070   060   030    Pre-fail  Always       -       11668590
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       624
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       20
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Temperature_Celsius     0x0022   067   064   045    Old_age   Always       -       605421601
194 Temperature_Celsius     0x0022   033   040   000    Old_age   Always       -       33 (Lifetime Min/Max 0/21)
195 Hardware_ECC_Recovered  0x001a   070   060   000    Old_age   Always       -       231279734
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   102   091   006    Pre-fail  Always       -       3716759
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       135985049
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2186
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       20
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Temperature_Celsius     0x0022   068   062   045    Old_age   Always       -       638910496
194 Temperature_Celsius     0x0022   032   040   000    Old_age   Always       -       32 (Lifetime Min/Max 0/21)
195 Hardware_ECC_Recovered  0x001a   072   057   000    Old_age   Always       -       17629155
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
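
A quick way to sanity-check the "extra bits in the raw value" idea is to mask off everything but the low byte of the attribute 190 raw values and compare the result with what attribute 194 reports on the same drive. The sketch below does that for the three drives in this message that list both attributes. The helper name low_byte is made up for illustration, and reading the low byte as the temperature is only an observation from these numbers, not a documented Seagate format; what the upper bytes mean is not addressed here.

```python
# (attribute 190 raw value, attribute 194 raw value) pairs copied from the
# three drives in this message that report both attributes.
SAMPLES = [
    (773849124, 36),
    (605421601, 33),
    (638910496, 32),
]


def low_byte(raw: int) -> int:
    """Return the least significant byte of a SMART raw value.

    Hypothetical helper: treating this byte as degrees Celsius is an
    assumption based on the sample data, not a vendor specification.
    """
    return raw & 0xFF


for raw_190, raw_194 in SAMPLES:
    # Show the raw value in hex so the individual bytes are visible.
    print(
        f"190 raw = {raw_190} (0x{raw_190:08x}), "
        f"low byte = {low_byte(raw_190)}, 194 raw = {raw_194}"
    )
```

For all three drives the low byte of attribute 190 comes out equal to the attribute 194 reading (36, 33 and 32), which fits the point that the huge raw numbers are packed fields rather than real temperatures.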