Date: Sat, 17 May 2008 09:52:23 +0200
From: Willy Offermans
To: Roland Smith
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: g_vfs_done error third part--PLEASE HELP!
Message-ID: <20080517075222.GA4250@wiz.vpn.offrom.nl>
In-Reply-To: <20080516190718.GA73178@slackbox.xs4all.nl>
Reply-To: Willy@Offermans.Rompen.nl

Hello Roland and FreeBSD friends,

On Fri, May 16, 2008 at 09:07:18PM +0200, Roland Smith wrote:
> On Fri, May 16, 2008 at 02:14:14PM +0200, Willy Offermans wrote:
>
> > Filesystem  1K-blocks     Used     Avail Capacity  Mounted on
> > /dev/ar0s1a  20308398   230438  18453290     1%    /
> > devfs               1        1         0   100%    /dev
> > /dev/ar0s1d  21321454  3814482  15801256    19%    /usr
> > /dev/ar0s1e  50777034  5331686  41383186    11%    /var
> > /dev/ar0s1f 101554150 18813760  74616058    20%    /home
> > /dev/ar0s1g 274977824 34564876 218414724    14%    /share
> >
> > pretty normal I would say.
>
> Yes.
>
> > > Did you notice any file corruption in the filesystem on ar0s1g?
> >
> > No the two disks are brand new and I did not encounter any noticeable
> > file corruption. However I assume that nowadays bad sectors on HD are
> > handled by the hardware and do not need any user interaction to correct
> > this. But maybe I'm totally wrong.
>
> Every ATA disk has spare sectors, and they usually don't report bad
> blocks until the spares are exhausted. In which case it is prudent to
> replace the disk.
>
> > > Unmount the filesystem and run fsck(8) on it. Does it report any errors?
> >
> > sun# fsck /dev/ar0s1g
> > ** /dev/ar0s1g
> > ** Last Mounted on /share
> > ** Phase 1 - Check Blocks and Sizes
> > INCORRECT BLOCK COUNT I=34788357 (272 should be 264)
> > CORRECT? [yn] y
> >
> > INCORRECT BLOCK COUNT I=34789217 (296 should be 288)
> > CORRECT? [yn] y
> >
> > ** Phase 2 - Check Pathnames
> > ** Phase 3 - Check Connectivity
> > ** Phase 4 - Check Reference Counts
> > ** Phase 5 - Check Cyl groups
> > FREE BLK COUNT(S) WRONG IN SUPERBLK
> > SALVAGE? [yn] y
> >
> > SUMMARY INFORMATION BAD
> > SALVAGE? [yn] y
> >
> > BLK(S) MISSING IN BIT MAPS
> > SALVAGE? [yn] y
> >
> > 182863 files, 17282440 used, 120206472 free (12448 frags, 15024253
> > blocks, 0.0% fragmentation)
> >
> > ***** FILE SYSTEM MARKED CLEAN *****
> >
> > ***** FILE SYSTEM WAS MODIFIED *****
> >
> > The usual stuff I would say.
>
> Disk corruption is never normal. It can be explained if the machine
> crashed or was power-cycled before the disks were unmounted, but it can
> also indicate hardware troubles.
>
> > > > Any hints are very much appreciated.
>
> > So I have to conclude that the write error message does make sense and
> > that something seems to be wrong with the disks. The next question is
> > what can I do about it? Should I return the disks to the shop and ask
> > for new ones?
>
> Install sysutils/smartmontools, and run 'smartctl -A /dev/adX|less', where X
> are the numbers of the drives in the RAID array.
>
> In the output, look at the values for Reallocated_Sector_Ct,
> Current_Pending_Sector, Offline_Uncorrectable, which is the last number
> that you see on each line.
>
> A small number for Reallocated_Sector_Ct is allowable. But non-zero counts
> for Current_Pending_Sector or Offline_Uncorrectable mean it's time to
> get a new disk.

sun# atacontrol status ar0
ar0: ATA RAID1 status: READY
 subdisks:
   0 ad4 ONLINE
   1 ad6 ONLINE

So ad4 and ad6 are the HDs of the array.
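
For anyone who wants to automate the check Roland describes, here is a minimal Python sketch. It assumes the ten-column `smartctl -A` attribute layout shown in this thread, with RAW_VALUE as the tenth field. The names `WATCHED`, `REALLOC_LIMIT`, `parse_raw_values` and `assess` are made up for illustration and are not part of smartmontools, and the limit of 10 reallocated sectors is an arbitrary stand-in for "a small number", which the thread does not quantify.

```python
# Minimal sketch: apply the rule of thumb above to `smartctl -A` output.
# All names here (WATCHED, REALLOC_LIMIT, parse_raw_values, assess) are
# hypothetical; nothing in this block is part of smartmontools.

WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)

# Assumption: "a small number" of reallocated sectors is not quantified in
# the thread; 10 is an arbitrary placeholder.
REALLOC_LIMIT = 10


def parse_raw_values(text):
    """Map each watched attribute name to its RAW_VALUE (the last column)."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have a numeric ID first and RAW_VALUE in column 10.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


def assess(values):
    """Return a list of warning strings; an empty list means no red flags."""
    warnings = []
    for name in ("Current_Pending_Sector", "Offline_Uncorrectable"):
        if values.get(name, 0) > 0:
            warnings.append(f"{name} is non-zero: replace the disk")
    if values.get("Reallocated_Sector_Ct", 0) > REALLOC_LIMIT:
        warnings.append("Reallocated_Sector_Ct is high: watch the disk")
    return warnings


# Three rows copied from the ad6 output in this message.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(assess(parse_raw_values(SAMPLE)) or "no red flags")
```

To run it against a live drive, the text could come from `subprocess.run(["smartctl", "-A", "/dev/ad4"], capture_output=True, text=True).stdout` instead of `SAMPLE`. That usually needs root, and the device name depends on the system.
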

sun# smartctl -A /dev/ad6
smartctl version 5.38 [i386-portbld-freebsd7.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       3
  3 Spin_Up_Time            0x0007   100   100   015    Pre-fail  Always       -       7232
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       31
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1478
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
 13 Read_Soft_Error_Rate    0x000e   100   100   000    Old_age   Always       -       439070649
187 Reported_Uncorrect      0x0032   253   253   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   253   253   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   062   060   000    Old_age   Always       -       38
194 Temperature_Celsius     0x0022   124   115   000    Old_age   Always       -       38
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       439070649
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
202 TA_Increase_Count       0x0032   253   253   000    Old_age   Always       -       0

sun# smartctl -A /dev/ad4
smartctl version 5.38 [i386-portbld-freebsd7.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       109
  3 Spin_Up_Time            0x0007   100   100   015    Pre-fail  Always       -       7360
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       32
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1478
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
 13 Read_Soft_Error_Rate    0x000e   100   100   000    Old_age   Always       -       835531250
187 Reported_Uncorrect      0x0032   253   253   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   253   253   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   062   060   000    Old_age   Always       -       38
194 Temperature_Celsius     0x0022   124   118   000    Old_age   Always       -       38
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       835531250
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
202 TA_Increase_Count       0x0032   253   253   000    Old_age   Always       -       0

The critical values you have mentioned are all zero, but maybe you
notice some other oddities.

>
> > However other people that I have contacted and who had a similar
> > problem before have solved it by using software raid setup instead of a
> > hardware raid setup. This seems to indicate that there is some bug in
> > the FreeBSD code.
>
> The RAID support that you find on most desktop motherboards _is_
> software RAID. See ataraid(4).

Well, then read "motherboard-supported RAID" instead of "hardware RAID"!
What I meant was that Toomas noticed a similar problem and turned to
gmirror to ``solve'' the issue. But something weird is going on
somewhere. I'm not the first one to run into this, and it would be nice
to nail it down so that no one has to suffer from it in the future.

>
> Roland
> --
> R.F.Smith                                   http://www.xs4all.nl/~rsmith/
> [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
> pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)

--
Met vriendelijke groeten,
With kind regards,
Mit freundlichen Gruessen,
De jrus wah,

Willy

*************************************
W.K. Offermans
Home:   +31 45 544 49 44
Mobile: +31 653 27 16 23
e-mail: Willy@Offermans.Rompen.nl

               Powered by ....

 (__)
 \\\'',)
   \/  \ ^
   .\._/_)

   www.FreeBSD.org