From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 08:45:15 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F2D01065670 for ; Sun, 4 Oct 2009 08:45:15 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from smtpfb1-g21.free.fr (smtpfb1-g21.free.fr [212.27.42.9]) by mx1.freebsd.org (Postfix) with ESMTP id 62A6B8FC15 for ; Sun, 4 Oct 2009 08:45:12 +0000 (UTC) Received: from smtp5-g21.free.fr (smtp5-g21.free.fr [212.27.42.5]) by smtpfb1-g21.free.fr (Postfix) with ESMTP id F31AF2DFB9 for ; Sun, 4 Oct 2009 10:35:03 +0200 (CEST) Received: from smtp5-g21.free.fr (localhost [127.0.0.1]) by smtp5-g21.free.fr (Postfix) with ESMTP id E6F4BD48098 for ; Sun, 4 Oct 2009 10:34:57 +0200 (CEST) Received: from boleskine.patpro.net (boleskine.patpro.net [82.230.142.222]) by smtp5-g21.free.fr (Postfix) with ESMTP id B8980D4803E for ; Sun, 4 Oct 2009 10:34:54 +0200 (CEST) Received: from [192.168.0.2] (unknown [192.168.0.2]) by boleskine.patpro.net (Postfix) with ESMTP id 48E0D1CD5E for ; Sun, 4 Oct 2009 10:34:54 +0200 (CEST) Message-Id: From: Patrick Proniewski To: freebsd-fs@freebsd.org Content-Type: multipart/signed; boundary=Apple-Mail-18--687296030; micalg=sha1; protocol="application/pkcs7-signature" Mime-Version: 1.0 (Apple Message framework v936) Date: Sun, 4 Oct 2009 10:34:54 +0200 X-Mailer: Apple Mail (2.936) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Offline uncorrectable sectors on hard drive X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 08:45:15 -0000 --Apple-Mail-18--687296030 Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Hi all, smartctl on my server returned an error yesterday : > The following warning/error was logged by 
the smartd daemon: > > Device: /dev/ad6, 1 Offline uncorrectable sectors /var/log/messages : > Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline > uncorrectable sectors > Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, Self-Test Log > error count increased from 0 to 1 ../.. > Oct 4 01:34:19 rack smartd[739]: Device: /dev/ad6, 1 Offline > uncorrectable sectors > Oct 4 02:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline > uncorrectable sectors first error flagged Oct 3 04:04:19, last error Oct 4 02:04:19. Since then, no more error in the logs. So my questions are: - is that "Offline uncorrectable sectors" in fact correctable? I've found howto's for extfs and reiserfs to correct this kind of error by remapping bad sectors, but nothing for UFS. - this error appears to have disappeared, does it mean the harddrive (or the fs) made the remapping by it self? Here is the smartctl -a output for this device: > # smartctl -a /dev/ad6 > smartctl version 5.38 [i386-portbld-freebsd6.4] Copyright (C) 2002-8 > Bruce Allen > Home page is http://smartmontools.sourceforge.net/ > > === START OF INFORMATION SECTION === > Model Family: Maxtor DiamondMax 10 family (ATA/133 and SATA/150) > Device Model: Maxtor 6L200M0 > Serial Number: L404EDDH > Firmware Version: BANC1E00 > User Capacity: 203,928,109,056 bytes > Device is: In smartctl database [for details use: -P show] > ATA Version is: 7 > ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0 > Local Time is: Sun Oct 4 10:23:02 2009 CEST > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > Warning! SMART Attribute Thresholds Structure error: invalid SMART > checksum. > === START OF READ SMART DATA SECTION === > SMART overall-health self-assessment test result: PASSED > > General SMART Values: > Offline data collection status: (0x02) Offline data collection > activity > was completed without error. > Auto Offline Data Collection: Disabled. 
> Self-test execution status: ( 0) The previous self-test > routine completed > without error or no self-test has ever > been run. > Total time to complete Offline > data collection: (1562) seconds. > Offline data collection > capabilities: (0x5b) SMART execute Offline immediate. > Auto Offline data collection on/off support. > Suspend Offline collection upon new > command. > Offline surface scan supported. > Self-test supported. > No Conveyance Self-test supported. > Selective Self-test supported. > SMART capabilities: (0x0003) Saves SMART data before > entering > power-saving mode. > Supports SMART auto save timer. > Error logging capability: (0x01) Error logging supported. > General Purpose Logging supported. > Short self-test routine > recommended polling time: ( 2) minutes. > Extended self-test routine > recommended polling time: ( 81) minutes. > SCT capabilities: (0x0021) SCT Status supported. > SCT Data Table supported. > > SMART Attributes Data Structure revision number: 16 > Vendor Specific SMART Attributes with Thresholds: > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE > UPDATED WHEN_FAILED RAW_VALUE > 3 Spin_Up_Time 0x0027 252 252 063 Pre-fail > Always - 3148 > 4 Start_Stop_Count 0x0032 253 253 000 Old_age > Always - 17 > 5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail > Always - 0 > 6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail > Offline - 0 > 7 Seek_Error_Rate 0x000a 253 252 000 Old_age > Always - 0 > 8 Seek_Time_Performance 0x0027 250 234 187 Pre-fail > Always - 51522 > 9 Power_On_Minutes 0x0032 154 154 000 Old_age > Always - 487h+21m > 10 Spin_Retry_Count 0x002b 252 252 157 Pre-fail > Always - 0 > 11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail > Always - 0 > 12 Power_Cycle_Count 0x0032 253 253 000 Old_age > Always - 20 > 192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age > Always - 0 > 193 Load_Cycle_Count 0x0032 253 253 000 Old_age > Always - 0 > 194 Temperature_Celsius 0x0032 023 253 000 Old_age > Always - 26 > 195 
Hardware_ECC_Recovered 0x000a 253 252 000 Old_age > Always - 7879 > 196 Reallocated_Event_Count 0x0008 251 251 000 Old_age > Offline - 2 > 197 Current_Pending_Sector 0x0008 253 253 000 Old_age > Offline - 0 > 198 Offline_Uncorrectable 0x0008 253 252 000 Old_age > Offline - 0 > 199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age > Offline - 0 > 200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age > Always - 0 > 201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age > Always - 0 > 202 TA_Increase_Count 0x000a 253 252 000 Old_age > Always - 0 > 203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail > Always - 0 > 204 Shock_Count_Write_Opern 0x000a 253 252 000 Old_age > Always - 0 > 205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age > Always - 0 > 207 Spin_High_Current 0x002a 252 252 000 Old_age > Always - 0 > 208 Spin_Buzz 0x002a 252 252 000 Old_age > Always - 0 > 209 Offline_Seek_Performnce 0x0024 239 239 000 Old_age > Offline - 170 > 210 Unknown_Attribute 0x0032 253 252 000 Old_age > Always - 0 > 211 Unknown_Attribute 0x0032 253 252 000 Old_age > Always - 0 > 212 Unknown_Attribute 0x0032 253 252 000 Old_age > Always - 0 > > Warning! SMART ATA Error Log Structure error: invalid SMART checksum. 
> SMART Error Log Version: 1 > No Errors Logged > > SMART Self-test log structure revision number 1 > Num Test_Description Status Remaining > LifeTime(hours) LBA_of_first_error > # 1 Short offline Completed without error 00% > 34339 - > # 2 Extended offline Completed: read failure 20% > 34317 201626851 > # 3 Short offline Completed without error 00% > 34315 - > # 4 Short offline Completed without error 00% > 34291 - > # 5 Short offline Completed without error 00% > 34267 - > # 6 Short offline Completed without error 00% > 34243 - > # 7 Short offline Completed without error 00% > 34220 - > # 8 Short offline Completed without error 00% > 34196 - > # 9 Short offline Completed without error 00% > 34172 - > #10 Extended offline Completed without error 00% > 34150 - > #11 Short offline Completed without error 00% > 34148 - > #12 Short offline Completed without error 00% > 34124 - > #13 Short offline Completed without error 00% > 34100 - > #14 Short offline Completed without error 00% > 34076 - > #15 Short offline Completed without error 00% > 34053 - > #16 Short offline Completed without error 00% > 34029 - > #17 Short offline Completed without error 00% > 34005 - > #18 Extended offline Completed without error 00% > 33983 - > #19 Short offline Completed without error 00% > 33981 - > #20 Short offline Completed without error 00% > 33957 - > #21 Short offline Completed without error 00% > 33933 - > > SMART Selective self-test log data structure revision number 1 > SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS > 1 0 0 Not_testing > 2 0 0 Not_testing > 3 0 0 Not_testing > 4 0 0 Not_testing > 5 0 0 Not_testing > Selective self-test flags (0x0): > After scanning selected spans, do NOT read-scan remainder of disk. > If Selective self-test is pending on power-up, resume after 0 minute > delay. 
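For reference, the LBA_of_first_error reported in the self-test log above can be turned into a byte offset and a rough position on the disk. This is only an arithmetic sketch: it assumes 512-byte logical sectors (the smartctl output does not state the sector size), and the function names are hypothetical. It does not map the sector to a UFS block or file.

```python
# Sketch: locate a failing LBA on the disk, assuming 512-byte sectors.
SECTOR_SIZE = 512  # assumption, not reported by smartctl 5.38 above


def lba_to_byte_offset(lba, sector_size=SECTOR_SIZE):
    """Return the byte offset of the first byte of the given sector."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    return lba * sector_size


def lba_position(lba, capacity_bytes, sector_size=SECTOR_SIZE):
    """Return the sector's position as a fraction of the whole disk (0.0 to 1.0)."""
    total_sectors = capacity_bytes // sector_size
    if not 0 <= lba < total_sectors:
        raise ValueError("LBA is outside the reported capacity")
    return lba / total_sectors


if __name__ == "__main__":
    failing_lba = 201626851          # LBA_of_first_error from the self-test log
    capacity = 203_928_109_056       # User Capacity from the information section
    print("byte offset:", lba_to_byte_offset(failing_lba))
    print("position on disk: {:.1%}".format(lba_position(failing_lba, capacity)))
```

The byte offset is only a starting point: translating it into a partition-relative filesystem block still requires the partition layout of the disk, which is not shown in this message.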
thanks, patpro --Apple-Mail-18--687296030-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 08:50:20 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7DBFC1065670 for ; Sun, 4 Oct 2009 08:50:20 +0000 (UTC) (envelope-from michael@fuckner.net) Received: from dedihh.fuckner.net (dedihh.fuckner.net [81.209.183.161]) by mx1.freebsd.org (Postfix) with ESMTP id 3ABA58FC1B for ; Sun, 4 Oct 2009 08:50:20 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by dedihh.fuckner.net (Postfix) with ESMTP id 0573F61C16; Sun, 4 Oct 2009 10:50:19 +0200 (CEST) X-Virus-Scanned: amavisd-new at fuckner.net Received: from dedihh.fuckner.net ([127.0.0.1]) by localhost (dedihh.fuckner.net [127.0.0.1]) (amavisd-new, port 10024) with SMTP id jX53UbLg7PUj; Sun, 4 Oct 2009 10:50:13 +0200 (CEST) Received: from c64.rebootking.de (e176140198.adsl.alicedsl.de [85.176.140.198]) by dedihh.fuckner.net (Postfix) with ESMTPA id 23DAB61C0F; Sun, 4 Oct 2009 10:50:13 +0200 (CEST) Message-ID: <4AC861BF.20109@fuckner.net> Date: Sun, 04 Oct 2009 10:50:07 +0200 From: Michael Fuckner User-Agent: Thunderbird 2.0.0.23 (X11/20090829) MIME-Version: 1.0 To: Patrick Proniewski References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Offline uncorrectable sectors on hard drive X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 08:50:20 -0000 Patrick Proniewski wrote: > Hi all, Hi Patrick, > ../.. >> Oct 4 01:34:19 rack smartd[739]: Device: /dev/ad6, 1 Offline >> uncorrectable sectors >> Oct 4 02:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline >> uncorrectable sectors > > first error flagged Oct 3 04:04:19, last error Oct 4 02:04:19. 
Since > then, no more error in the logs. This looks like a bad drive > So my questions are: > > - is that "Offline uncorrectable sectors" in fact correctable? I've > found howto's for extfs and reiserfs to correct this kind of error by > remapping bad sectors, but nothing for UFS. > - this error appears to have disappeared, does it mean the harddrive (or > the fs) made the remapping by it self? sector errors are on the hardware, they don't deal with the filesystem. Perhaps it was a once in a lifetime event, perhaps the drive is going to die really soon. Please do a smartctl -t long and see if the drive has to be replaced. Regards, Michael! From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 10:34:20 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 755D71065679; Sun, 4 Oct 2009 10:34:20 +0000 (UTC) (envelope-from delphij@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 4CAEF8FC24; Sun, 4 Oct 2009 10:34:20 +0000 (UTC) Received: from freefall.freebsd.org (delphij@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n94AYJmn080891; Sun, 4 Oct 2009 10:34:19 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n94AYJJw080881; Sun, 4 Oct 2009 10:34:19 GMT (envelope-from delphij) Date: Sun, 4 Oct 2009 10:34:19 GMT Message-Id: <200910041034.n94AYJJw080881@freefall.freebsd.org> To: delphij@FreeBSD.org, freebsd-fs@FreeBSD.org, delphij@FreeBSD.org From: delphij@FreeBSD.org Cc: Subject: Re: kern/139312: [tmpfs] [patch] tmpfs mmap synchronization bug X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 10:34:20 
-0000 Synopsis: [tmpfs] [patch] tmpfs mmap synchronization bug Responsible-Changed-From-To: freebsd-fs->delphij Responsible-Changed-By: delphij Responsible-Changed-When: Sun Oct 4 10:34:04 UTC 2009 Responsible-Changed-Why: Take. http://www.freebsd.org/cgi/query-pr.cgi?pr=139312 From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 11:22:25 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF82D106568B for ; Sun, 4 Oct 2009 11:22:25 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from rack.patpro.net (rack.patpro.net [193.30.227.216]) by mx1.freebsd.org (Postfix) with ESMTP id 573DB8FC08 for ; Sun, 4 Oct 2009 11:22:25 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP id 7635713C; Sun, 4 Oct 2009 13:06:21 +0200 (CEST) X-Virus-Scanned: amavisd-new at patpro.net Received: from amavis-at-patpro.net ([127.0.0.1]) by localhost (rack.patpro.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cElQjk+tcCRJ; Sun, 4 Oct 2009 13:06:20 +0200 (CEST) Received: from [IPv6:::1] (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP; Sun, 4 Oct 2009 13:06:20 +0200 (CEST) Message-Id: <5AFF10D8-2B03-4DE8-83E8-20B6A832BBA8@patpro.net> From: Patrick Proniewski To: Michael Fuckner In-Reply-To: <4AC861BF.20109@fuckner.net> Content-Type: multipart/signed; boundary=Apple-Mail-6--678214530; micalg=sha1; protocol="application/pkcs7-signature" Mime-Version: 1.0 (Apple Message framework v936) Date: Sun, 4 Oct 2009 13:06:15 +0200 References: <4AC861BF.20109@fuckner.net> X-Mailer: Apple Mail (2.936) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Offline uncorrectable sectors on hard drive X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Sun, 04 Oct 2009 11:22:26 -0000 --Apple-Mail-6--678214530 Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Hi Michael, On 4 oct. 09, at 10:50, Michael Fuckner wrote: >> - is that "Offline uncorrectable sectors" in fact correctable? I've >> found howto's for extfs and reiserfs to correct this kind of error >> by remapping bad sectors, but nothing for UFS. >> - this error appears to have disappeared, does it mean the >> harddrive (or the fs) made the remapping by it self? > sector errors are on the hardware, they don't deal with the > filesystem. > > Perhaps it was a once in a lifetime event, perhaps the drive is > going to die really soon. > > Please do a smartctl -t long and see if the drive has to be replaced. here is the result (from `smartctl -l selftest /dev/ad6` after the `smartctl -t long /dev/ad6` ): Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum. === START OF READ SMART DATA SECTION === SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended offline Completed without error 00% 34350 - # 2 Short offline Completed without error 00% 34339 - # 3 Extended offline Completed: read failure 20% 34317 201626851 # 4 Short offline Completed without error 00% 34315 - # 5 Short offline Completed without error 00% 34291 - # 6 Short offline Completed without error 00% 34267 - # 7 Short offline Completed without error 00% 34243 - # 8 Short offline Completed without error 00% 34220 - # 9 Short offline Completed without error 00% 34196 - #10 Short offline Completed without error 00% 34172 - #11 Extended offline Completed without error 00% 34150 - #12 Short offline Completed without error 00% 34148 - #13 Short offline Completed without error 00% 34124 - #14 Short offline Completed without error 00% 34100 - #15 Short offline Completed without error 00% 34076 - #16 Short offline Completed without 
error 00% 34053 - #17 Short offline Completed without error 00% 34029 - #18 Short offline Completed without error 00% 34005 - #19 Extended offline Completed without error 00% 33983 - #20 Short offline Completed without error 00% 33981 - #21 Short offline Completed without error 00% 33957 - It looks like the error is gone, but may be I'm wrong. regards, patpro --Apple-Mail-6--678214530-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 11:43:00 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1CD601065670 for ; Sun, 4 Oct 2009 11:43:00 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id CBDBC8FC1D for ; Sun, 4 Oct 2009 11:42:59 +0000 (UTC) Received: from localhost (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 5E2B119E02E; Sun, 4 Oct 2009 13:42:58 +0200 (CEST) Received: from [192.168.1.2] (r5bb235.net.upc.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 1C9CC19E027; Sun, 4 Oct 2009 13:42:56 +0200 (CEST) Message-ID: <4AC88A56.6080806@quip.cz> Date: Sun, 04 Oct 2009 13:43:18 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 X-Accept-Language: cz, cs, en, en-us MIME-Version: 1.0 To: Patrick Proniewski References: <4AC861BF.20109@fuckner.net> <5AFF10D8-2B03-4DE8-83E8-20B6A832BBA8@patpro.net> In-Reply-To: <5AFF10D8-2B03-4DE8-83E8-20B6A832BBA8@patpro.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Offline uncorrectable sectors on hard drive X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 11:43:00 -0000 Patrick Proniewski wrote: > Hi Michael, > > On 4 oct. 09, at 10:50, Michael Fuckner wrote: > >>> - is that "Offline uncorrectable sectors" in fact correctable? I've >>> found howto's for extfs and reiserfs to correct this kind of error >>> by remapping bad sectors, but nothing for UFS. >>> - this error appears to have disappeared, does it mean the harddrive >>> (or the fs) made the remapping by it self? >> >> sector errors are on the hardware, they don't deal with the filesystem. >> >> Perhaps it was a once in a lifetime event, perhaps the drive is going >> to die really soon. >> >> Please do a smartctl -t long and see if the drive has to be replaced. > > > > here is the result (from `smartctl -l selftest /dev/ad6` after the > `smartctl -t long /dev/ad6` ): > > Warning! SMART Attribute Thresholds Structure error: invalid SMART > checksum. > === START OF READ SMART DATA SECTION === > SMART Self-test log structure revision number 1 > Num Test_Description Status Remaining > LifeTime(hours) LBA_of_first_error > # 1 Extended offline Completed without error 00% > 34350 - > # 2 Short offline Completed without error 00% > 34339 - > # 3 Extended offline Completed: read failure 20% > 34317 201626851 > # 4 Short offline Completed without error 00% [...] > It looks like the error is gone, but may be I'm wrong. It is better to look at diff of Vendor Specific Attributes (smartctl -A) You have Reallocated_Event_Count 2, so maybe bad sectors were reallocated. 
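A minimal sketch of such an attribute diff, assuming two `smartctl -A` outputs have been saved as text. The helper names `parse_attributes` and `diff_attributes` are hypothetical, and the sample rows are copied from the outputs posted in this thread (three attributes only, to keep it short). RAW_VALUE is kept as a string because some attributes, such as Power_On_Minutes, are not plain integers.

```python
import re

# One attribute row: ID, name, flag, VALUE, WORST, THRESH, type, updated,
# when_failed, raw value. The header row does not start with a number, so it
# is skipped automatically.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(.+?)\s*$"
)


def parse_attributes(text):
    """Map attribute ID -> (name, normalized VALUE, RAW_VALUE string)."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attr_id, name, value, _worst, _thresh, raw = match.groups()
            attrs[int(attr_id)] = (name, int(value), raw)
    return attrs


def diff_attributes(before, after):
    """List attributes whose VALUE or RAW_VALUE differs between two snapshots."""
    changes = []
    for attr_id in sorted(before.keys() & after.keys()):
        name, old_value, old_raw = before[attr_id]
        _, new_value, new_raw = after[attr_id]
        if (old_value, old_raw) != (new_value, new_raw):
            changes.append((attr_id, name, old_value, new_value, old_raw, new_raw))
    return changes


BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  8 Seek_Time_Performance   0x0027   250   234   187    Pre-fail  Always       -       51522
196 Reallocated_Event_Count 0x0008   251   251   000    Old_age   Offline      -       2
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
"""

AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  8 Seek_Time_Performance   0x0027   252   234   187    Pre-fail  Always       -       54404
196 Reallocated_Event_Count 0x0008   251   251   000    Old_age   Offline      -       2
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       1
"""

if __name__ == "__main__":
    for change in diff_attributes(parse_attributes(BEFORE), parse_attributes(AFTER)):
        print("%3d %-24s VALUE %d -> %d, RAW %s -> %s" % change)
```

Keying on the attribute ID rather than the name avoids collisions between the several rows that smartctl labels Unknown_Attribute.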
Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 14:50:21 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 17BD51065679 for ; Sun, 4 Oct 2009 14:50:21 +0000 (UTC) (envelope-from patpro@patpro.net) Received: from rack.patpro.net (rack.patpro.net [193.30.227.216]) by mx1.freebsd.org (Postfix) with ESMTP id 8165E8FC16 for ; Sun, 4 Oct 2009 14:50:20 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP id 5700F13C; Sun, 4 Oct 2009 16:50:19 +0200 (CEST) X-Virus-Scanned: amavisd-new at patpro.net Received: from amavis-at-patpro.net ([127.0.0.1]) by localhost (rack.patpro.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OOaWbReyy0kQ; Sun, 4 Oct 2009 16:50:17 +0200 (CEST) Received: from [IPv6:::1] (localhost [127.0.0.1]) by rack.patpro.net (Postfix) with ESMTP; Sun, 4 Oct 2009 16:50:17 +0200 (CEST) Message-Id: <8FF3DBC3-FB26-4C02-9135-79117659FCE7@patpro.net> From: Patrick Proniewski To: Miroslav Lachman <000.fbsd@quip.cz> In-Reply-To: <4AC88A56.6080806@quip.cz> Content-Type: multipart/signed; boundary=Apple-Mail-7--664774200; micalg=sha1; protocol="application/pkcs7-signature" Mime-Version: 1.0 (Apple Message framework v936) Date: Sun, 4 Oct 2009 16:50:16 +0200 References: <4AC861BF.20109@fuckner.net> <5AFF10D8-2B03-4DE8-83E8-20B6A832BBA8@patpro.net> <4AC88A56.6080806@quip.cz> X-Mailer: Apple Mail (2.936) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Offline uncorrectable sectors on hard drive X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 14:50:21 -0000 --Apple-Mail-7--664774200 Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit 
Miroslav, On 4 oct. 09, at 13:43, Miroslav Lachman wrote: >> It looks like the error is gone, but may be I'm wrong. > > It is better to look at diff of Vendor Specific Attributes (smartctl > -A) here we are: # smartctl -A /dev/ad6 smartctl version 5.38 [i386-portbld-freebsd6.4] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum. === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 3 Spin_Up_Time 0x0027 252 252 063 Pre-fail Always - 3148 4 Start_Stop_Count 0x0032 253 253 000 Old_age Always - 17 5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 0 6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail Offline - 0 7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0 8 Seek_Time_Performance 0x0027 252 234 187 Pre-fail Always - 54404 9 Power_On_Minutes 0x0032 154 154 000 Old_age Always - 493h+40m 10 Spin_Retry_Count 0x002b 252 252 157 Pre-fail Always - 0 11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 253 253 000 Old_age Always - 20 192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0 194 Temperature_Celsius 0x0032 023 253 000 Old_age Always - 25 195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 5695 196 Reallocated_Event_Count 0x0008 251 251 000 Old_age Offline - 2 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline - 0 198 Offline_Uncorrectable 0x0008 253 252 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age Offline - 0 200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0 201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 0 202 TA_Increase_Count 0x000a 253 252 000 Old_age Always - 0 203 Run_Out_Cancel 
0x000b 253 252 180 Pre-fail Always - 1 204 Shock_Count_Write_Opern 0x000a 253 252 000 Old_age Always - 0 205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age Always - 0 207 Spin_High_Current 0x002a 252 252 000 Old_age Always - 0 208 Spin_Buzz 0x002a 252 252 000 Old_age Always - 0 209 Offline_Seek_Performnce 0x0024 239 239 000 Old_age Offline - 170 210 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0 211 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0 212 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0 thanks, patpro --Apple-Mail-7--664774200-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 17:48:00 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E48D1065676 for ; Sun, 4 Oct 2009 17:48:00 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 4C1B28FC1B for ; Sun, 4 Oct 2009 17:47:58 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id ED85245E91; Sun, 4 Oct 2009 19:47:56 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 6298245E87; Sun, 4 Oct 2009 19:47:48 +0200 (CEST) Date: Sun, 4 Oct 2009 19:47:47 +0200 From: Pawel Jakub Dawidek To: Solon Lutz Message-ID: <20091004174746.GF1660@garage.freebsd.pl> References: <683849754.20091001110503@pyro.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="CXFpZVxO6m2Ol4tQ" Content-Disposition: inline In-Reply-To: <683849754.20091001110503@pyro.de> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 
(2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.2 required=4.5 tests=BAYES_00,PLING_QUERY, RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 17:48:00 -0000 --CXFpZVxO6m2Ol4tQ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, Oct 01, 2009 at 11:05:03AM +0200, Solon Lutz wrote:
> Hi erverybody,
>
> I'm faced with a 10TB ZFS pool on a 12TB RAID6 Areca controller.
> And yes, I know, you shouldn't put a zpool on a RAID-device... =(

Just to be sure: you have no redundancy on ZFS level at all? That's very, very bad idea for important data (you know that already, but to warn others)...

> The cable was replaced, a parity check was run on the RAID-Volume and
> showed no errors, the zfs scrub however showed some 'defective' files.
> After copying these files with 'dd -conv=noerror...' and comparing them
> to the originals, they were error-free.
>
> Yesterday however, three more defective cables forced the controller
> to take the RAID6 volume offline. Now all cables were replaced and a parity
> check was run on the RAID-Volume -> data integrity OK.

This means absolutely nothing. It just means that parity match the actual data, it doesn't mean the data is fine from file system or application perspective.

> But now ZFS refuses to mount all volumes:
>
> Solaris: WARNING: can't process intent log for temp/space1
> Solaris: WARNING: can't process intent log for temp/space2
> Solaris: WARNING: can't process intent log for temp/space3
> Solaris: WARNING: can't process intent log for temp/space4
>
> A scrub revealed to following:
>
> errors: Permanent errors have been detected in the following files:
>
>         temp:<0x0>
>         temp/space1:<0x0>
>         temp/space2:<0x0>
>         temp/space3:<0x0>
>         temp/space4:<0x0>
>
>
> I tried to switch off checksums for this pool, but that didn't help in any
> way. I also mounted the pool by hand and was faced with with 'empty' volumes
> and 'I/O errors' when trying to list their contents...
>
> Any suggestions? I'm offering some self-made blackberry jam and raspberry brandy
> to the person who can help to restore or backup the data.
>
> Tech specs:
>
> FreeBSD 7.2-STABLE #21: Tue May 5 18:44:10 CEST 2009 (AMD64)
> da0 at arcmsr0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-5 device
> da0: 166.666MB/s transfers (83.333MHz DT, offset 32, 16bit)
> da0: Command Queueing Enabled
> da0: 10490414MB (21484367872 512 byte sectors: 255H 63S/T 1337340C)
> ZFS filesystem version 6
> ZFS storage pool version 6

If you are able to backup your disks, do it before we go further. I've some ideas, but they can mess up your data even further.

First of all I'd start with upgrading system to stable/8, there could be better error recovery.

Do not write anything new to the pool, actually do not even read from it as it may trigger writting as well.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
--CXFpZVxO6m2Ol4tQ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKyN/CForvXbEpPzQRApwSAJ9BsZk3v4YCjhbbKgjcfQPxpGM3ewCfbQdf 0kw0+VtPZlxxmdHP1WJSB+0= =hpWC -----END PGP SIGNATURE----- --CXFpZVxO6m2Ol4tQ-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 20:05:09 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CB8BB1065672 for ; Sun, 4 Oct 2009 20:05:09 +0000 (UTC) (envelope-from aaron@goflexitllc.com) Received: from mail.goflexitllc.com (mail.goflexitllc.com [70.38.81.12]) by mx1.freebsd.org (Postfix) with ESMTP id 69D208FC1D for ; Sun, 4 Oct 2009 20:05:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=goflexitllc.com; h=message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; s=zeta; bh=ld67HCtbNCV+3Pfk/oBxovT1ev o=; b=iWy1Xz1KW0Kz6boYaBTN4x4VWgFj45S/2/9H0UY5PuHXEIbI/vIRW+8FP9 OTE1gkDctUSGD23SIFyp35+CMXA+0L/Hfo8HBybRyMXV4QiSKSKJ0ZJLUIdDTmky smhWeN DomainKey-Signature: a=rsa-sha1; c=nofws; d=goflexitllc.com; h= message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; q=dns; s=zeta; b=kqYUO3GEwr9yimW1+UGr dtafnk8HTyz+LuNfhG79ZOjyDqg9YYV0dyca/SZj8N1GGK0lFtWaxupHJk6P9R9r ZvP+A5nHEL4R4zLghSCqc4dm8fwytn+hScUkYSYDbD1p Received: (qmail 64088 invoked by uid 89); 4 Oct 2009 20:05:06 -0000 Received: (simscan 1.4.1 ppid 64077 pid 64085 t 0.3147s) (scanners: regex: 1.4.1 attach: 1.4.1 clamav: 0.95.2/m:51/d:9840); 04 Oct 0109 20:05:06 -0000 Received: from temp4.wavelinx.net (HELO ?172.16.1.128?) 
(aaron@goflexitllc.com@69.27.151.4) by mail.goflexitllc.com with ESMTPA; 4 Oct 2009 20:05:06 -0000 Message-ID: <4AC8FFE9.90606@goflexitllc.com> Date: Sun, 04 Oct 2009 15:04:57 -0500 From: Aaron Hurt User-Agent: Thunderbird 2.0.0.22 (X11/20090719) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl> In-Reply-To: <20091004174746.GF1660@garage.freebsd.pl> Content-Type: multipart/mixed; boundary="------------040508080101090601010809" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 20:05:09 -0000 This is a multi-part message in MIME format. --------------040508080101090601010809 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Pawel Jakub Dawidek wrote: > On Thu, Oct 01, 2009 at 11:05:03AM +0200, Solon Lutz wrote: > >> Hi erverybody, >> >> I'm faced with a 10TB ZFS pool on a 12TB RAID6 Areca controller. >> And yes, I know, you shouldn't put a zpool on a RAID-device... =( >> > > Just to be sure: you have no redundancy on ZFS level at all? That's > very, very bad idea for important data (you know that already, but to > warn others)... > > >> The cable was replaced, a parity check was run on the RAID-Volume and >> showed no errors, the zfs scrub however showed some 'defective' files. >> After copying these files with 'dd -conv=noerror...' and comparing them >> to the originals, they were error-free. >> >> Yesterday however, three more defective cables forced the controller >> to take the RAID6 volume offline. Now all cables were replaced and a parity >> check was run on the RAID-Volume -> data integrity OK. >> > > This means absolutely nothing. 
It just means that parity match the > actual data, it doesn't mean the data is fine from file system or > application perspective. > > >> But now ZFS refuses to mount all volumes: >> >> Solaris: WARNING: can't process intent log for temp/space1 >> Solaris: WARNING: can't process intent log for temp/space2 >> Solaris: WARNING: can't process intent log for temp/space3 >> Solaris: WARNING: can't process intent log for temp/space4 >> >> A scrub revealed to following: >> >> errors: Permanent errors have been detected in the following files: >> >> temp:<0x0> >> temp/space1:<0x0> >> temp/space2:<0x0> >> temp/space3:<0x0> >> temp/space4:<0x0> >> >> >> I tried to switch off checksums for this pool, but that didn't help in any >> way. I also mounted the pool by hand and was faced with with 'empty' volumes >> and 'I/O errors' when trying to list their contents... >> >> Any suggestions? I'm offering some self-made blackberry jam and raspberry brandy >> to the person who can help to restore or backup the data. >> >> Tech specs: >> >> FreeBSD 7.2-STABLE #21: Tue May 5 18:44:10 CEST 2009 (AMD64) >> da0 at arcmsr0 bus 0 target 0 lun 0 >> da0: Fixed Direct Access SCSI-5 device >> da0: 166.666MB/s transfers (83.333MHz DT, offset 32, 16bit) >> da0: Command Queueing Enabled >> da0: 10490414MB (21484367872 512 byte sectors: 255H 63S/T 1337340C) >> ZFS filesystem version 6 >> ZFS storage pool version 6 >> > > If you are able to backup your disks, do it before we go further. I've > some ideas, but they can mess up your data even further. > > First of all I'd start with upgrading system to stable/8, there could be > better error recovery. > > Do not write anything new to the pool, actually do not even read from it > as it may trigger writting as well. > > I am experiencing a similar issue with a small box here at the house. It is not on a raid controller, just a standard 4 port non-raid. It is also giving an I/O error unable to import message. 
This started happening after a situation similar to the above where I had a drive going bad that started giving dma read/write errors and causing the machine to lock up...not panic or crash just freeze...so I just turned the machine off until I had time to backup the data and move it to a new array. However, when I went to do that to this particular raidz1 pool showed faulted and had a status message about corrupt meta data. I hoped I could export/import that pool to get it back in a readable state. That didn't work, the array exported fine without error but now refuses to import saying I/O error unable to import. Long story made short, I would also be very appreciative of any ZFS related data recovery information or processes. -- Aaron Hurt Managing Partner Flex I.T., LLC 611 Commerce Street Suite 3117 Nashville, TN 37203 Phone: 615.438.7101 E-mail: aaron@goflexitllc.com --------------040508080101090601010809-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 20:35:22 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 727AE106568B; Sun, 4 Oct 2009 20:35:22 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yx0-f171.google.com (mail-yx0-f171.google.com [209.85.210.171]) by mx1.freebsd.org (Postfix) with ESMTP id 187618FC12; Sun, 4 Oct 2009 20:35:21 +0000 (UTC) Received: by yxe1 with SMTP id 1so2670695yxe.3 for ; Sun, 04 Oct 2009 13:35:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=lsn8KKGwu65VWoUxrNKSCe0tOVu/AckQp/IgvUShyeI=; b=cuY2zHo3g2oqRYSSw30/EWYmK8xgk+FdclGREmCjq4jXcBECCpbLSwiderzAu7EJa7 DBK4mGt6RGvDT1XUgzL9QTzb+iuOOL92G6jr/xjEEzCoN/5QhSBIJ+H/1IMMgOoKNCYk 6pR6kPfv4sq5Gsf9IFQ9t+yBMUyXOkuvSe/Qw= DomainKey-Signature: a=rsa-sha1; c=nofws; 
d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=bsobGdF1f/1bN+/uSMTRtkmXhSXOD1J9+jk3U4vLo6TRgjuTCsoc5czXVdhpy4aGV9 k3v8FW35CqpCfKRV42b+HL+Hl7Rk26TL7Gqetf9X8CQwRZrjQcLpo222NVFZ6rZBRg6E Hzt6RQwS9gJx9sU27UGwbmD4UkeUQyF3RVGR8= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.90.217.3 with SMTP id p3mr3062775agg.22.1254688521311; Sun, 04 Oct 2009 13:35:21 -0700 (PDT) In-Reply-To: <4AC8FFE9.90606@goflexitllc.com> References: <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl> <4AC8FFE9.90606@goflexitllc.com> Date: Sun, 4 Oct 2009 13:35:21 -0700 X-Google-Sender-Auth: b65ceea0daf30f10 Message-ID: From: Artem Belevich To: Aaron Hurt Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 20:35:22 -0000 Similar issue has been discussed on various zfs-related lists. Following post contains number of useful links on the issue itself and the ways to recover. 
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26704.html --Artem From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 21:22:44 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E19F81065707 for ; Sun, 4 Oct 2009 21:22:43 +0000 (UTC) (envelope-from aaron@goflexitllc.com) Received: from mail.goflexitllc.com (mail.goflexitllc.com [70.38.81.12]) by mx1.freebsd.org (Postfix) with ESMTP id 6E35A8FC16 for ; Sun, 4 Oct 2009 21:22:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=goflexitllc.com; h=message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; s=zeta; bh=oNEbOcTSndoEiixECZvuGfKqDb k=; b=WJhJEEbOQR2Mi2OrX9qaM7E6uqN4kceaVvScaWU7gM1MttrUBJWOvKclWZ nToe/vWCcFl8yv7l7WWOzC7PzGi7inTsdCXw5E6lOKFJD12UAUvsGACIG/cojmjD bueV0V DomainKey-Signature: a=rsa-sha1; c=nofws; d=goflexitllc.com; h= message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; q=dns; s=zeta; b=DdQFd1zrEbDIEzkrAY3I 1i4nIEEjf5Nkw3ysnl8gRKnmfFXS9XTOewSB2WggPHX14uuRO6aAX3X887THRD/g JuOJHSlGRr+W15i2G3MXW30B8Yxby0yMrZDiIQdo6plh Received: (qmail 65227 invoked by uid 89); 4 Oct 2009 21:22:40 -0000 Received: (simscan 1.4.1 ppid 65219 pid 65224 t 0.6505s) (scanners: regex: 1.4.1 attach: 1.4.1 clamav: 0.95.2/m:51/d:9840); 04 Oct 0109 21:22:39 -0000 Received: from temp4.wavelinx.net (HELO ?172.16.1.128?) 
(aaron@goflexitllc.com@69.27.151.4) by mail.goflexitllc.com with ESMTPA; 4 Oct 2009 21:22:39 -0000 Message-ID: <4AC91216.9070200@goflexitllc.com> Date: Sun, 04 Oct 2009 16:22:30 -0500 From: Aaron Hurt User-Agent: Thunderbird 2.0.0.22 (X11/20090719) MIME-Version: 1.0 To: Artem Belevich References: <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl> <4AC8FFE9.90606@goflexitllc.com> In-Reply-To: Content-Type: multipart/mixed; boundary="------------050008030608050201080905" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 21:22:44 -0000 This is a multi-part message in MIME format. --------------050008030608050201080905 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Artem Belevich wrote: > Similar issue has been discussed on various zfs-related lists. > > Following post contains number of useful links on the issue itself and > the ways to recover. > http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26704.html > > --Artem > Thank you for the links, I read through them and read victor's post...however, I am not even able to get the command he mentioned to start on my particular machine: schroeder# zdb -bbccsv 5666150306755460943 zdb: can't open 5666150306755460943: Input/output error I do have some more hope than before after reading those threads, but I am unsure of how to proceed past this point. Not sure if it will help but here is the zdb -ll for all the devices in this pool. 
schroeder# zdb -ll /dev/ad4s1 -------------------------------------------- LABEL 0 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14725838678957352699 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 1 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14725838678957352699 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 2 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14725838678957352699 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 
children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 3 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14725838678957352699 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 schroeder# zdb -ll /dev/ad6s1 -------------------------------------------- LABEL 0 -------------------------------------------- version=13 name='tank0' state=0 txg=478204 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14258118189710079071 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 
-------------------------------------------- LABEL 1 -------------------------------------------- version=13 name='tank0' state=0 txg=478204 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14258118189710079071 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 2 -------------------------------------------- version=13 name='tank0' state=0 txg=478204 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14258118189710079071 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 3 -------------------------------------------- version=13 name='tank0' state=0 txg=478204 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=14258118189710079071 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 
path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 schroeder# zdb -ll /dev/ad8s1 -------------------------------------------- LABEL 0 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=3285817555281532863 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 1 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=3285817555281532863 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 2 
-------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=3285817555281532863 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 3 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=3285817555281532863 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 schroeder# zdb -ll /dev/ad10s1 -------------------------------------------- LABEL 0 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=4760880554491391388 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 
path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 1 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=4760880554491391388 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 2 -------------------------------------------- version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=4760880554491391388 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 -------------------------------------------- LABEL 3 -------------------------------------------- 
version=13 name='tank0' state=0 txg=478240 pool_guid=5666150306755460943 hostid=2440235366 hostname='unset' top_guid=2037719367177844310 guid=4760880554491391388 vdev_tree type='raidz' id=0 guid=2037719367177844310 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=1600332496896 is_log=0 children[0] type='disk' id=0 guid=14725838678957352699 path='/dev/ad4s1' whole_disk=0 DTL=323 children[1] type='disk' id=1 guid=14258118189710079071 path='/dev/ad6s1' whole_disk=0 DTL=327 removed=1 children[2] type='disk' id=2 guid=3285817555281532863 path='/dev/ad8s1' whole_disk=0 DTL=326 children[3] type='disk' id=3 guid=4760880554491391388 path='/dev/ad10s1' whole_disk=0 DTL=325 schroeder# Thank You, Aaron --------------050008030608050201080905-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 21:23:49 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 40C581065692 for ; Sun, 4 Oct 2009 21:23:49 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 80EEB8FC18 for ; Sun, 4 Oct 2009 21:23:48 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 3C56F45EE6; Sun, 4 Oct 2009 23:23:46 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 3B2E245B36; Sun, 4 Oct 2009 23:23:41 +0200 (CEST) Date: Sun, 4 Oct 2009 23:23:39 +0200 From: Pawel Jakub Dawidek To: Aaron Hurt Message-ID: <20091004212339.GK1660@garage.freebsd.pl> References: <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl> <4AC8FFE9.90606@goflexitllc.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; 
boundary="5mZBmBd1ZkdwT1ny" Content-Disposition: inline In-Reply-To: <4AC8FFE9.90606@goflexitllc.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.2 required=4.5 tests=BAYES_00,PLING_QUERY, RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 21:23:49 -0000 --5mZBmBd1ZkdwT1ny Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Oct 04, 2009 at 03:04:57PM -0500, Aaron Hurt wrote: > I am experiencing a similar issue with a small box here at the house. =20 > It is not on a raid controller, just a standard 4 port non-raid. It is= =20 > also giving an I/O error unable to import message. This started=20 > happening after a situation similar to the above where I had a drive=20 > going bad that started giving dma read/write errors and causing the=20 > machine to lock up...not panic or crash just freeze...so I just turned=20 > the machine off until I had time to backup the data and move it to a new= =20 > array. However, when I went to do that to this particular raidz1 pool=20 > showed faulted and had a status message about corrupt meta data. I=20 > hoped I could export/import that pool to get it back in a readable=20 > state. That didn't work, the array exported fine without error but now= =20 > refuses to import saying I/O error unable to import. Long story made=20 > short, I would also be very appreciative of any ZFS related data=20 > recovery information or processes. 
For starters I'd need FreeBSD version you use, 'zpool import' output, 'zdb -l /dev/' output for each pool component. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --5mZBmBd1ZkdwT1ny Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKyRJbForvXbEpPzQRAtySAJ45Hw6AssVEp7IEwLKwfRGmztLD5ACgyVeY WtwSXUeexRjqNwMwokipQQs= =zZDz -----END PGP SIGNATURE----- --5mZBmBd1ZkdwT1ny-- From owner-freebsd-fs@FreeBSD.ORG Sun Oct 4 21:36:13 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 331841065679 for ; Sun, 4 Oct 2009 21:36:13 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 5AA228FC0A for ; Sun, 4 Oct 2009 21:36:12 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id D031245EB2; Sun, 4 Oct 2009 23:36:10 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 5E22745F6C; Sun, 4 Oct 2009 23:36:05 +0200 (CEST) Date: Sun, 4 Oct 2009 23:36:04 +0200 From: Pawel Jakub Dawidek To: Aaron Hurt Message-ID: <20091004213604.GL1660@garage.freebsd.pl> References: <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl> <4AC8FFE9.90606@goflexitllc.com> <4AC91216.9070200@goflexitllc.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="FnOKg9Ah4tDwTfQS" Content-Disposition: inline In-Reply-To: <4AC91216.9070200@goflexitllc.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc 
X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.2 required=4.5 tests=BAYES_00,PLING_QUERY, RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Help needed! ZFS I/O error recovery? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Oct 2009 21:36:13 -0000 --FnOKg9Ah4tDwTfQS Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Oct 04, 2009 at 04:22:30PM -0500, Aaron Hurt wrote: > schroeder# zdb -ll /dev/ad4s1 [...] > txg=3D478240 [...] > schroeder# zdb -ll /dev/ad6s1 [...] > txg=3D478204 [...] > schroeder# zdb -ll /dev/ad8s1 [...] > txg=3D478240 [...] > schroeder# zdb -ll /dev/ad10s1 [...] > txg=3D478240 As you can see, one of your vdevs (ad6s1) has a lower transaction group number than the others. The difference is quite big (36 uberblocks), so we may not be able to go that far into the past. It might be that ZFS doesn't want to use ad6s1, because it is not up-to-date and there are some errors on one of the other slices. If you have your data backed up, you may try this patch: http://people.freebsd.org/~pjd/patches/vdev_label.c.patch Once you run ZFS with the patch, you can try setting sysctl vfs.zfs.maxtxg to 478204 and try importing your pool. It will try to import the pool at the txg from ad6s1. It won't work if the other slices don't have this uberblock anymore or some earlier data is already overwritten. Do it at your own risk, as it might mess up your data even further. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
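The txg comparison described in this message can be sketched as a small script. This is only an illustration and not part of the patch mentioned above: the function names (label_txgs, txg_lag) and the shortened SAMPLE strings are made up for the example. It assumes the label text has been captured from zdb and decoded to plain text, so that fields read "txg=478240" rather than the quoted-printable form seen in the quote.

```python
import re

def label_txgs(zdb_text):
    """Return every label-level txg value found in zdb label output.

    The lookbehind skips keys that merely end in "txg" (e.g. "create_txg=").
    """
    return [int(n) for n in re.findall(r"(?<!\w)txg=(\d+)", zdb_text)]

def txg_lag(outputs):
    """Map each stale device to how many txgs it is behind the newest one.

    `outputs` maps a device path to the label text captured for it.
    Raises ValueError if a device's text contains no txg field at all.
    """
    newest = {dev: max(label_txgs(text)) for dev, text in outputs.items()}
    pool_txg = max(newest.values())
    return {dev: pool_txg - txg for dev, txg in newest.items() if txg < pool_txg}

# Hypothetical, heavily shortened label text using the txg values from this thread.
SAMPLE = {
    "/dev/ad4s1": "version=13 name='tank0' state=0 txg=478240",
    "/dev/ad6s1": "version=13 name='tank0' state=0 txg=478204",
    "/dev/ad8s1": "version=13 name='tank0' state=0 txg=478240",
    "/dev/ad10s1": "version=13 name='tank0' state=0 txg=478240",
}

if __name__ == "__main__":
    print(txg_lag(SAMPLE))  # {'/dev/ad6s1': 36}
```

The script only reads text that was already collected; it does not touch the pool, which matters here given the advice to avoid further reads and writes on a damaged pool.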
--FnOKg9Ah4tDwTfQS Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKyRVEForvXbEpPzQRAm/ZAJ4v6u1A6ydJeJqLaRGQQErju0vxnACgueLf w+HiK8I/HrqCK2O+CezHf7k= =jZW9 -----END PGP SIGNATURE----- --FnOKg9Ah4tDwTfQS-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 07:24:19 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EF439106568F; Mon, 5 Oct 2009 07:24:18 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id 057CA8FC15; Mon, 5 Oct 2009 07:24:17 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 79DEC1371A2; Mon, 5 Oct 2009 09:24:16 +0200 (CEST) X-CRM114-Version: 20090423-BlameSteveJobs ( TRE 0.7.6 (BSD) ) MF-ACE0E1EA [pR: 17.0389] X-CRM114-CacheID: sfid-20091005_09241_A1B3987E X-CRM114-Status: Good ( pR: 17.0389 ) Message-ID: <4AC99F1D.3040300@fsn.hu> Date: Mon, 05 Oct 2009 09:24:13 +0200 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> In-Reply-To: <20091002184526.GA1660@garage.freebsd.pl> X-Stationery: 0.4.10 Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.3 (people.fsn.hu); Mon, 05 Oct 2009 09:24:15 +0200 (CEST) Cc: freebsd-fs@FreeBSD.org Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 
2009 07:24:19 -0000

On 10/02/09 20:45, Pawel Jakub Dawidek wrote:
> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>
>> Backing out this change from the 8-STABLE kernel:
>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>
>> makes it survive about half an hour of IMAP searching. Of course only
>> time will tell whether this helps in the long run, but so far 10/10
>> tries succeeded in killing the machine with this method...
>>
>
> Could you try this patch:
>
> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>
Sure. But before that, a report with the above modification: the machine
survived for some days, then started to behave strangely. Meaning I could
ping it, I could log in to the IMAP service (running from ZFS), read some
mails, but not all.
I could not access it via ssh (which runs from UFS), but an already
running top from a different session was alive. It showed:

last pid: 11272;  load averages:  0.00,  0.00,  0.00    up 3+15:21:13  09:11:43
149 processes: 1 running, 143 sleeping, 1 zombie, 4 waiting
CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
Mem: 234M Active, 197M Inact, 559M Wired, 111M Buf, 440K Free
Swap: 4096M Total, 976K Used, 4095M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
78492 root        1  44    0  4700K  2156K CPU1    1   5:37  0.00% top
92343 root        1  44    0  4132K  1576K nanslp  1   4:12  0.00% gstat
13401 root        1  44    0  1528K   456K piperd  0   2:19  0.00% readproctitl
12679 root        1  44    0  3932K  1236K vmwait  1   2:12  0.00% zpool
35988 125         4  45    0 16892K  5968K sigwai  0   1:53  0.00% milter-greyl
25656 root        1  45    0  1536K   564K getblk  0   1:45  0.00% supervise
25798 root        1  44    0  1536K   564K vmwait  0   1:44  0.00% supervise
28406 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
30226 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
35401 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
29203 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
21629 389         6  44    0 91664K 41892K ucond   0   1:02  0.00% slapd
72283 60          1  44    0 80972K  1948K select  1   0:34  0.00% idled
98960 root        1  44    0  9396K  2544K select  1   0:32  0.00% sshd
 1550 root        1  44    0  3340K   940K vmwait  1   0:32  0.00% syslogd
 5463 125         1  44    0  6924K  2036K vmwait  0   0:27  0.00% qmgr
54193 root        1  44    0  9396K  2516K select  0   0:22  0.00% sshd

I could not log in at the console; it didn't even give a "user name"
field after hitting enter. Strange.

I will try the patch.

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 11:06:51 2009
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 98AA9106568D for ; Mon, 5 Oct 2009 11:06:51 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6392D8FC17 for ; Mon, 5 Oct 2009 11:06:51 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n95B6pIP088654 for ; Mon, 5 Oct 2009 11:06:51 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n95B6oeR088652 for freebsd-fs@FreeBSD.org; Mon, 5 Oct 2009 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 5 Oct 2009 11:06:50 GMT
Message-Id: <200910051106.n95B6oeR088652@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 05 Oct 2009 11:06:51 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138524  fs         [msdosfs] disks and usb flashes/cards with Russian lab
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138367  fs         [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/138109  fs         [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs         [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
f kern/128173  fs         [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs         [tmpfs] [patch] sendfile on tmpfs data corruption
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
f kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li
f kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o kern/122047  fs         [ext2fs] [patch] incorrect handling of UF_IMMUTABLE /
o kern/122038  fs         [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
f usb/112640   fs         [ext2fs] [hang] Kernel freezes when writing a file to
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/105093  fs         [ext2fs] [patch] ext2fs on read-only media cannot be m
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/89991   fs         [ufs] softupdates with mount -ur causes fs UNREFS
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/77826   fs         [ext2fs] ext2fs usb filesystem will not mount RW
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

135 problems
total. From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 15:00:18 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3F92A1065696 for ; Mon, 5 Oct 2009 15:00:18 +0000 (UTC) (envelope-from cowens@greatbaysoftware.com) Received: from portcityhosting.com (bayringfw.portcityweb.com [64.140.243.92]) by mx1.freebsd.org (Postfix) with ESMTP id BC09B8FC0A for ; Mon, 5 Oct 2009 15:00:17 +0000 (UTC) Received: from [127.0.0.1] ([173.14.128.81]) by portcityhosting.com with MailEnable ESMTP; Mon, 5 Oct 2009 10:20:19 -0400 X-WatchGuard-Mail-Exception: Allow Message-ID: <4ACA015D.3090800@greatbaysoftware.com> Date: Mon, 05 Oct 2009 10:23:25 -0400 From: Charles Owens MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: multipart/mixed; boundary="------------070607070806010207090300" X-WatchGuard-AntiVirus: part scanned. clean action=allow X-ME-Bayesian: 0.000000 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 15:00:18 -0000 This is a multi-part message in MIME format. --------------070607070806010207090300 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-WatchGuard-AntiVirus: part scanned. clean action=allow Hello folks, We've had a system crash, apparently related to GEOM_JOURNAL, on an i386 system running 7.0-RELEASE-p11. Here's what we could see on the screen (formatted for readability): GEOM_JOURNAL: [copy] Error while writting data (error=6) \ ad4s1a[WRITE(offset=43561402368, length=16384)] GEOM_JOURNAL: [copy] Error while writting data (error=6) \ ad4s1a[WRITE(offset=48868164096, length=16896)] GEOM_JOURNAL: Error while reading data from ad4s1a (error=6). 
mode=0134172, inum=5323776, fs = /
panic: ffs_valloc: dup alloc
cpuid = 3
Uptime: 119d10h5m43s
Cannot dump. No dump device defined.
GEOM_JOURNAL: Flush of cache of ad4s1a: error=6.
GEOM_JOURNAL: [flush] Error while writting data (error=6) \
    ad4s1a[WRITE(offset=48868197888, length=98816)]
(4 more lines like last one)
Rebooting...
cpu_reset: Stopping other CPUs

The system didn't actually reboot; it just got stuck there. When it was
eventually rebooted manually, it booted just fine.

Any thoughts as to what could be the real problem? What does "error=6"
indicate?

I've done some scouring of the net and found something that may not
directly relate to this crash, but does relate, at least, to my
filesystem configuration. One of the threads:

http://markmail.org/message/tamo4r2jho3zdv3z

In the described crash, similar error messages were seen, but with
"error=1". Ultimately Pawel Dawidek (gjournal author) gave the diagnosis
that the crash was related to the first filesystem in the slice being set
up with an offset of zero, not the correct offset of 16. Either in this
thread or elsewhere I also learned that sysinstall always uses the zero
offset, even though it is not best practice. Not a happy discovery.

Looking at our system that crashed, sure enough: zero offset (see the
label below; both 'a' and 'd' are journaled).

So this then prompts two questions:

* Can our crash be explained by the zero offset filesystem configuration?
* If not, separate from the crash, how much should we be worried about
  running a system with gjournal like this?

Thanks very much for any and all assistance,

Charles

# bsdlabel ad4s1
# /dev/ad4s1:
8 partitions:
#        size    offset    fstype   [fsize bsize bps/cpg]
  a:  77594624         0    4.2BSD     2048 16384 28552
  b:  24113088  77594624      swap
  c: 156296322         0    unused        0     0         # "raw" part, don't edit
  d:  54588610 101707712    4.2BSD     2048 16384 28552

-- 
Charles Owens
Great Bay Software | e: cowens@GreatBaySoftware.com

--------------070607070806010207090300--

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 15:51:36 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3266C106566B; Mon, 5 Oct 2009 15:51:36 +0000 (UTC) (envelope-from artemb@gmail.com)
Received: from mail-yw0-f204.google.com (mail-yw0-f204.google.com [209.85.211.204]) by mx1.freebsd.org (Postfix) with ESMTP id C7B228FC08; Mon, 5 Oct 2009 15:51:35 +0000 (UTC)
Received: by ywh42 with SMTP id 42so2540968ywh.28 for ; Mon, 05 Oct 2009 08:51:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=NhAOZf4zBczyhhG7nqHkyLvvGiDYj5bpF1afs2RvOXs=; b=IEU5FUGUpJz+SqyDN32X1kzWXGxPIL1Bjm6QgxRER0Tv6mrvn/oa0DNzjATDEWYDL2 z8F5qc0gzdmGXYfC4jX/Vi5JSrhRdzAdZLYFvYLmivb00apmDdj3tbThEp5rZ7TX5eNj JoQ19jc+6rcPC95hNqOh4cem8Nlcs46PUU8rI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=fdeVjLeS6Th8OhOTIi3ush1dPUjLzc5BXLEz6ZCtwJTkcRpH68lF9+gbUXHCwfQW+F dUKGhzauadelHyU4expzXQ2WetNryuuCmz2efeJRy+pPorbJCUcNgFCkZI+sVjCmRyYd s2DI+eSUJZMDrIYJeQEfX/gUnpgmQCBa5Oc+Y=
MIME-Version: 1.0
Sender: artemb@gmail.com
Received: by 10.90.22.18 with SMTP id 18mr88708agv.20.1254757894250; Mon, 05 Oct 2009
08:51:34 -0700 (PDT)
In-Reply-To: <4AC99F1D.3040300@fsn.hu>
References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <4AC99F1D.3040300@fsn.hu>
Date: Mon, 5 Oct 2009 08:51:34 -0700
X-Google-Sender-Auth: 4902a9a24d034a97
Message-ID:
From: Artem Belevich
To: Attila Nagy
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek
Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 05 Oct 2009 15:51:36 -0000

Your lockup is very similar (processes stuck sleeping on vmwait) to
what I had when arc_min was set too high. With Pawel's patch ZFS would
not give up any memory above arc_min. Try bringing vfs.zfs.arc_min
down.

--Artem

2009/10/5 Attila Nagy :
> On 10/02/09 20:45, Pawel Jakub Dawidek wrote:
>>
>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>
>>>
>>> Backing out this change from the 8-STABLE kernel:
>>>
>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>>
>>> makes it survive about half an hour of IMAP searching. Of course only
>>> time will tell whether this helps in the long run, but so far 10/10 tries
>>> succeeded in killing the machine with this method...
>>>
>>
>> Could you try this patch:
>>
>>        http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>
>
> Sure. But before that, a report with the above modification: the machine
> survived for some days, then started to behave strangely. Meaning I could
> ping it, I could log in to the IMAP service (running from ZFS), read some
> mails, but not all.
> I could not access it via ssh (which runs from UFS), but an already running
> top from a different session was alive. It showed:
> last pid: 11272;  load averages:  0.00,  0.00,  0.00    up 3+15:21:13  09:11:43
> 149 processes: 1 running, 143 sleeping, 1 zombie, 4 waiting
> CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
> Mem: 234M Active, 197M Inact, 559M Wired, 111M Buf, 440K Free
> Swap: 4096M Total, 976K Used, 4095M Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 78492 root        1  44    0  4700K  2156K CPU1    1   5:37  0.00% top
> 92343 root        1  44    0  4132K  1576K nanslp  1   4:12  0.00% gstat
> 13401 root        1  44    0  1528K   456K piperd  0   2:19  0.00% readproctitl
> 12679 root        1  44    0  3932K  1236K vmwait  1   2:12  0.00% zpool
> 35988 125         4  45    0 16892K  5968K sigwai  0   1:53  0.00% milter-greyl
> 25656 root        1  45    0  1536K   564K getblk  0   1:45  0.00% supervise
> 25798 root        1  44    0  1536K   564K vmwait  0   1:44  0.00% supervise
> 28406 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
> 30226 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
> 35401 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
> 29203 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
> 21629 389         6  44    0 91664K 41892K ucond   0   1:02  0.00% slapd
> 72283 60          1  44    0 80972K  1948K select  1   0:34  0.00% idled
> 98960 root        1  44    0  9396K  2544K select  1   0:32  0.00% sshd
>  1550 root        1  44    0  3340K   940K vmwait  1   0:32  0.00% syslogd
>  5463 125         1  44    0  6924K  2036K vmwait  0   0:27  0.00% qmgr
> 54193 root        1  44    0  9396K  2516K select  0   0:22  0.00% sshd
>
> I could not log in at the console; it didn't even give a "user name" field
> after hitting enter. Strange.
>
> I will try the patch.
>
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 16:13:06 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AABC11065679 for ; Mon, 5 Oct 2009 16:13:06 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 0D1408FC17 for ; Mon, 5 Oct 2009 16:13:05 +0000 (UTC)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 7A6B845E87; Mon, 5 Oct 2009 18:13:03 +0200 (CEST)
Received: from localhost (pdawidek.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id B25D345683; Mon, 5 Oct 2009 18:12:58 +0200 (CEST)
Date: Mon, 5 Oct 2009 18:12:58 +0200
From: Pawel Jakub Dawidek
To: Charles Owens
Message-ID: <20091005161258.GE1702@garage.freebsd.pl>
References: <4ACA015D.3090800@greatbaysoftware.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="47eKBCiAZYFK5l32"
Content-Disposition: inline
In-Reply-To: <4ACA015D.3090800@greatbaysoftware.com>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 9.0-CURRENT i386
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl
X-Spam-Level:
X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4
Cc: freebsd-fs@freebsd.org
Subject: Re: gjournal crash: "error while writing data (error=6)"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 05 Oct 2009 16:13:06 -0000

--47eKBCiAZYFK5l32
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Oct 05, 2009 at 10:23:25AM -0400, Charles Owens wrote:
> Hello folks,
>
> We've had a system crash, apparently related to GEOM_JOURNAL, on an i386
> system running 7.0-RELEASE-p11. Here's what we could see on the screen
> (formatted for readability):
>
> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
>     ad4s1a[WRITE(offset=43561402368, length=16384)]
> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
>     ad4s1a[WRITE(offset=48868164096, length=16896)]
> GEOM_JOURNAL: Error while reading data from ad4s1a (error=6).

Error 6 (ENXIO, man errno(2)) might mean that ad4s1a disappeared. Were
there any earlier errors indicating that ad4 was disconnected or
similar?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
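
The mapping from the bare number in "error=6" to a symbolic name can also be checked from a script rather than the man page. A minimal sketch, assuming a platform where errno 6 is ENXIO (this holds on FreeBSD and Linux); the helper name describe_errno is hypothetical, and the message text returned by os.strerror differs between operating systems.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, platform message) for a numeric errno.

    describe_errno is an illustrative helper, not part of any FreeBSD tool.
    """
    # errno.errorcode maps numeric values to names such as "ENXIO".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's own wording for the error.
    return name, os.strerror(code)


name, message = describe_errno(6)
print(f"error=6 -> {name}: {message}")
```

The symbolic name is the stable part; the human-readable message is whatever the local C library provides.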
--47eKBCiAZYFK5l32 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKyhsKForvXbEpPzQRAsZXAJ44teQktHnG8wEXHqrGdTO5eMkMygCg6dif 5Tk3ne5M4kLwdwVcATvPCxY= =QEdr -----END PGP SIGNATURE----- --47eKBCiAZYFK5l32-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 16:26:23 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4437B106568D; Mon, 5 Oct 2009 16:26:23 +0000 (UTC) (envelope-from cowens@greatbaysoftware.com) Received: from portcityhosting.com (bayringfw.portcityweb.com [64.140.243.92]) by mx1.freebsd.org (Postfix) with ESMTP id D42AF8FC47; Mon, 5 Oct 2009 16:26:22 +0000 (UTC) Received: from [127.0.0.1] ([173.14.128.81]) by portcityhosting.com with MailEnable ESMTP; Mon, 5 Oct 2009 12:26:23 -0400 X-WatchGuard-Mail-Exception: Allow Message-ID: <4ACA1EEA.3070204@greatbaysoftware.com> Date: Mon, 05 Oct 2009 12:29:30 -0400 From: Charles Owens MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4ACA015D.3090800@greatbaysoftware.com> <20091005161258.GE1702@garage.freebsd.pl> In-Reply-To: <20091005161258.GE1702@garage.freebsd.pl> Content-Type: multipart/mixed; boundary="------------000101080209090701020205" X-WatchGuard-AntiVirus: part scanned. clean action=allow X-ME-Bayesian: 0.000000 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 16:26:23 -0000 This is a multi-part message in MIME format. --------------000101080209090701020205 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-WatchGuard-AntiVirus: part scanned. 
clean action=allow Pawel Jakub Dawidek wrote: > On Mon, Oct 05, 2009 at 10:23:25AM -0400, Charles Owens wrote: > >> Hello folks, >> >> We've had a system crash, apparently related to GEOM_JOURNAL, on an i386 >> system running 7.0-RELEASE-p11. Here's what we could see on the screen >> (formatted for readability): >> >> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >> ad4s1a[WRITE(offset=43561402368, length=16384)] >> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >> ad4s1a[WRITE(offset=48868164096, length=16896)] >> GEOM_JOURNAL: Error while reading data from ad4s1a (error=6). >> > > Error 6 (ENXIO, man errno(2)) might mean that ad4s1a disappeared. There > were no any errors earlier indicating that ad4 was disconnected or > similar? > Such as if someone had come by and temporarily pulled out the drive? Nothing in the logs of the sort, no. Do you think this is unrelated to the offset-zero layout question? In any case, should we be worried about that? Thanks for the reply. 
--------------000101080209090701020205-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 16:59:00 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0B6CB1065670 for ; Mon, 5 Oct 2009 16:59:00 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 4BEBD8FC0A for ; Mon, 5 Oct 2009 16:58:59 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 66DE145CDD; Mon, 5 Oct 2009 18:58:57 +0200 (CEST) Received: from localhost (pdawidek.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 6DF7B45C8A; Mon, 5 Oct 2009 18:58:52 +0200 (CEST) Date: Mon, 5 Oct 2009 18:58:52 +0200 From: Pawel Jakub Dawidek To: Charles Owens Message-ID: <20091005165852.GF1702@garage.freebsd.pl> References: <4ACA015D.3090800@greatbaysoftware.com> <20091005161258.GE1702@garage.freebsd.pl> <4ACA1EEA.3070204@greatbaysoftware.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="cz6wLo+OExbGG7q/" Content-Disposition: inline In-Reply-To: <4ACA1EEA.3070204@greatbaysoftware.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 
2009 16:59:00 -0000

--cz6wLo+OExbGG7q/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Oct 05, 2009 at 12:29:30PM -0400, Charles Owens wrote:
> Pawel Jakub Dawidek wrote:
> > On Mon, Oct 05, 2009 at 10:23:25AM -0400, Charles Owens wrote:
> >
> >> Hello folks,
> >>
> >> We've had a system crash, apparently related to GEOM_JOURNAL, on an i386
> >> system running 7.0-RELEASE-p11. Here's what we could see on the screen
> >> (formatted for readability):
> >>
> >> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
> >>     ad4s1a[WRITE(offset=43561402368, length=16384)]
> >> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
> >>     ad4s1a[WRITE(offset=48868164096, length=16896)]
> >> GEOM_JOURNAL: Error while reading data from ad4s1a (error=6).
> >>
> >
> > Error 6 (ENXIO, man errno(2)) might mean that ad4s1a disappeared. Were
> > there any earlier errors indicating that ad4 was disconnected or
> > similar?
> >
>
> Such as if someone had come by and temporarily pulled out the drive?

It could be a buggy ATA controller or a failing disk.

> Nothing in the logs of the sort, no.

What is the exact partition size in bytes?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
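
One way to see why the partition size is worth asking about is to compare it with the write offsets in the original report. A rough sketch, assuming 512-byte sectors and taking the sector count of ad4s1a (77594624) from the bsdlabel output posted at the start of the thread; the names below are illustrative and the script only does arithmetic on figures already quoted in this thread.

```python
SECTOR_SIZE = 512              # assumption: 512-byte sectors
PARTITION_SECTORS = 77594624   # size of 'a' in the posted bsdlabel output

# (offset, length) pairs copied from the GEOM_JOURNAL messages in the report.
FAILED_WRITES = [
    (43561402368, 16384),
    (48868164096, 16896),
    (48868197888, 98816),
]

partition_bytes = PARTITION_SECTORS * SECTOR_SIZE


def beyond_partition(offset, length, size):
    """True if the request would end past the last byte of the partition."""
    return offset + length > size


print(f"partition size: {partition_bytes} bytes")
for offset, length in FAILED_WRITES:
    where = "beyond end" if beyond_partition(offset, length, partition_bytes) else "within"
    print(f"offset={offset} length={length}: {where}")
```

With these figures each logged offset is larger than the labeled size of the partition. The sketch does not by itself say what caused the ENXIO errors; it only makes the comparison explicit.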
--cz6wLo+OExbGG7q/ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKyiXMForvXbEpPzQRAmaKAKC/pcz22nl3pIEuksNrSxrOhrXVVwCg2noH 22OUfkxw8ZeVckzwsgdt6hU= =6yiy -----END PGP SIGNATURE----- --cz6wLo+OExbGG7q/-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 17:53:46 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 01C501065672; Mon, 5 Oct 2009 17:53:46 +0000 (UTC) (envelope-from cowens@greatbaysoftware.com) Received: from portcityhosting.com (bayringfw.portcityweb.com [64.140.243.92]) by mx1.freebsd.org (Postfix) with ESMTP id 7F9E48FC0C; Mon, 5 Oct 2009 17:53:45 +0000 (UTC) Received: from [127.0.0.1] ([173.14.128.81]) by portcityhosting.com with MailEnable ESMTP; Mon, 5 Oct 2009 13:53:45 -0400 X-WatchGuard-Mail-Exception: Allow Message-ID: <4ACA3363.1090309@greatbaysoftware.com> Date: Mon, 05 Oct 2009 13:56:51 -0400 From: Charles Owens MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4ACA015D.3090800@greatbaysoftware.com> <20091005161258.GE1702@garage.freebsd.pl> <4ACA1EEA.3070204@greatbaysoftware.com> <20091005165852.GF1702@garage.freebsd.pl> In-Reply-To: <20091005165852.GF1702@garage.freebsd.pl> Content-Type: multipart/mixed; boundary="------------050008040402030407040109" X-WatchGuard-AntiVirus: part scanned. clean action=allow X-ME-Bayesian: 0.000000 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 17:53:46 -0000 This is a multi-part message in MIME format. 
--------------050008040402030407040109 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-WatchGuard-AntiVirus: part scanned. clean action=allow Pawel Jakub Dawidek wrote: > On Mon, Oct 05, 2009 at 12:29:30PM -0400, Charles Owens wrote: > >> Pawel Jakub Dawidek wrote: >> >>> On Mon, Oct 05, 2009 at 10:23:25AM -0400, Charles Owens wrote: >>> >>> >>>> Hello folks, >>>> >>>> We've had a system crash, apparently related to GEOM_JOURNAL, on an i386 >>>> system running 7.0-RELEASE-p11. Here's what we could see on the screen >>>> (formatted for readability): >>>> >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >>>> ad4s1a[WRITE(offset=43561402368, length=16384)] >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >>>> ad4s1a[WRITE(offset=48868164096, length=16896)] >>>> GEOM_JOURNAL: Error while reading data from ad4s1a (error=6). >>>> >>>> >>> Error 6 (ENXIO, man errno(2)) might mean that ad4s1a disappeared. There >>> were no any errors earlier indicating that ad4 was disconnected or >>> similar? >>> >>> >> Such as if someone had come by and temporarily pulled out the drive? >> > > It can be buggy ata controller or failing disk. > > >> Nothing in the logs of the sort, no. >> > > What is exact partition size in bytes? 
> >From 'bsdlabel ad4s1': a: 77594624 0 4.2BSD 2048 16384 28552 77594624 * 512 = 39728447488 (37 GB) **Charles Owens** *Great Bay Software***** --------------050008040402030407040109-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 18:27:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB1831065679; Mon, 5 Oct 2009 18:27:08 +0000 (UTC) (envelope-from cowens@greatbaysoftware.com) Received: from portcityhosting.com (bayringfw.portcityweb.com [64.140.243.92]) by mx1.freebsd.org (Postfix) with ESMTP id 14DC98FC08; Mon, 5 Oct 2009 18:27:07 +0000 (UTC) Received: from [127.0.0.1] ([173.14.128.81]) by portcityhosting.com with MailEnable ESMTP; Mon, 5 Oct 2009 14:27:08 -0400 X-WatchGuard-Mail-Exception: Allow Message-ID: <4ACA3B37.6070501@greatbaysoftware.com> Date: Mon, 05 Oct 2009 14:30:15 -0400 From: Charles Owens MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4ACA015D.3090800@greatbaysoftware.com> <20091005161258.GE1702@garage.freebsd.pl> <4ACA1EEA.3070204@greatbaysoftware.com> <20091005165852.GF1702@garage.freebsd.pl> In-Reply-To: <20091005165852.GF1702@garage.freebsd.pl> Content-Type: multipart/mixed; boundary="------------060900070605020605020408" X-WatchGuard-AntiVirus: part scanned. clean action=allow X-ME-Bayesian: 0.000000 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 18:27:08 -0000 This is a multi-part message in MIME format. --------------060900070605020605020408 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-WatchGuard-AntiVirus: part scanned. 
clean action=allow Pawel Jakub Dawidek wrote: > On Mon, Oct 05, 2009 at 12:29:30PM -0400, Charles Owens wrote: > >> Pawel Jakub Dawidek wrote: >> >>> On Mon, Oct 05, 2009 at 10:23:25AM -0400, Charles Owens wrote: >>> >>> >>>> Hello folks, >>>> >>>> We've had a system crash, apparently related to GEOM_JOURNAL, on an i386 >>>> system running 7.0-RELEASE-p11. Here's what we could see on the screen >>>> (formatted for readability): >>>> >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >>>> ad4s1a[WRITE(offset=43561402368, length=16384)] >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \ >>>> ad4s1a[WRITE(offset=48868164096, length=16896)] >>>> GEOM_JOURNAL: Error while reading data from ad4s1a (error=6). >>>> >>>> >>> Error 6 (ENXIO, man errno(2)) might mean that ad4s1a disappeared. There >>> were no any errors earlier indicating that ad4 was disconnected or >>> similar? >>> >>> >> Such as if someone had come by and temporarily pulled out the drive? >> > > It can be buggy ata controller or failing disk. > > >> Nothing in the logs of the sort, no. >> > > What is exact partition size in bytes Sorry, I'd grabbed the size from a similar system which I realized has a smaller drive. Here's the partition size from the real system in question: 59055800320 (55 GB) While I'm at it, here's some additional detail on gjournal: # gjournal list Geom name: gjournal 2048257491 ID: 2048257491 Providers: 1. Name: ad4s1a.journal Mediasize: 45182090752 (42G) Sectorsize: 512 Mode: r1w1e1 Consumers: 1. Name: ad4s1a Mediasize: 59055800320 (55G) Sectorsize: 512 Mode: r1w1e1 Jend: 59055799808 Jstart: 45182090752 Role: Data,Journal Geom name: gjournal 3790277183 ID: 3790277183 Providers: 1. Name: ad4s1d.journal Mediasize: 168868626944 (157G) Sectorsize: 512 Mode: r1w1e1 Consumers: 1. 
Name: ad4s1d Mediasize: 182742336512 (170G) Sectorsize: 512 Mode: r1w1e1 Jend: 182742336000 Jstart: 168868626944 Role: Data,Journal # bsdlabel ad4s1 # /dev/ad4s1: 8 partitions: # size offset fstype [fsize bsize bps/cpg] a: 115343360 0 4.2BSD 2048 16384 28552 b: 16130016 115343360 swap c: 488392002 0 unused 0 0 # "raw" part, don't edit d: 356918626 131473376 4.2BSD 2048 16384 28552 **Charles Owens** *Great Bay Software* --------------060900070605020605020408-- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 5 18:28:34 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 003A9106566B; Mon, 5 Oct 2009 18:28:33 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yx0-f184.google.com (mail-yx0-f184.google.com [209.85.210.184]) by mx1.freebsd.org (Postfix) with ESMTP id 6626F8FC0C; Mon, 5 Oct 2009 18:28:33 +0000 (UTC) Received: by yxe14 with SMTP id 14so4183840yxe.7 for ; Mon, 05 Oct 2009 11:28:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=+jMFWEDWcEi2OyS0hHNqSQvDUrlE15u6cfaTFptYegk=; b=eOFkx4djNY8jkFfdijVm5+8i4t91zMyc2FSfX5zbOw9CB8Cefk7rEmCDc0NJuyEflT Fp09A9ovCLAZjBRGbgOc3j8N3WbneoV2FyDbPB7tQWnr6OXSU5+vug6TVmR5Cp0i7wt8 sHynWA8RwfK15WWa9ySbUANwtm0BXyhZ/aplk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=C26FMtaIFiYAOs8AY+EKc1uVNmT1rDUpaNdYAfb6OFs+pgp9mZYGVD+9jnRDE55EIN nY5zo8apy91wDTZD8HRVmleqgNBs577gJJPq14ukZxWFSgK0STMiTQWCgi+F8WGWfqXz 3g/GR5m/ph97StH9pUFCh1hkwVmh9aEiFU5NU= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.90.16.34 with SMTP id 34mr157806agp.47.1254767312445; Mon, 05 Oct 2009 11:28:32 -0700 (PDT) 
In-Reply-To: References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <20091003000909.GD1660@garage.freebsd.pl> Date: Mon, 5 Oct 2009 11:28:32 -0700 X-Google-Sender-Auth: e3489b7891f75ece Message-ID: From: Artem Belevich To: Pawel Jakub Dawidek Content-Type: multipart/mixed; boundary=001636283b62ef22e804753447f5 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2009 18:28:34 -0000 --001636283b62ef22e804753447f5 Content-Type: text/plain; charset=ISO-8859-1 I left my box running over the weekend (vm.kmem_size=8G) in a loop doing a build and then deleting build results. Each loop cycle takes about 5 hours. All in all it touches about 2GB of sources and produces about 30GB of object files and stuff. This morning ARC size is around 2.5G. Now and then it dips down to 1G. I've attached graph with memory stats and ARC size. --Artem > =============================================================== > Now, the same experiment, with vm.kmem_size=8G > vm.kmem_size: 8589934592 > vfs.zfs.arc_min: 939524096 > vfs.zfs.arc_max: 7516192768 > > ARC grows to 6.2G: > Mem: 47M Active, 13M Inact, 7376M Wired, 31M Buf, 473M Free > > Then it quickly shrinks to 4.6G and grows to 6.2G again, shrinks again, etc.. > > What's different from the previous case is that after a while ZFS > adjusts target size (kstat.zfs.misc.arcstats.c) down to ~5.8G and > after that ZFS size oscillates between 4.2G and 5.6G. Another > observation -- ARC shrinking happens when system is left with ~512M of > free memory. Yet another observation is that even with ARC peak of > ~5.8G, system has about 7.5G wired. Where did almost 2G of difference > go? Fragmentation? 
> > I've tried both experiments with and without L2ARC -- behavior seems > to be the same. > > --Artem > --001636283b62ef22e804753447f5-- From owner-freebsd-fs@FreeBSD.ORG Tue Oct 6 04:30:20 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15BB91065670; Tue, 6 Oct 2009 04:30:20 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E06768FC0C; Tue, 6 Oct 2009 04:30:19 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n964UJYr090016; Tue, 6 Oct 2009 04:30:19 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n964UJtK090006; Tue, 6 Oct 2009 04:30:19 GMT (envelope-from linimon) Date: Tue, 6 Oct 2009 04:30:19 GMT Message-Id: <200910060430.n964UJtK090006@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-s@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/139363: [nfs] diskless root nfs mount from non FreeBSD server broken X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Oct 2009 04:30:20 -0000 Synopsis: [nfs] diskless root nfs mount from non FreeBSD server broken Responsible-Changed-From-To: freebsd-s->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Tue Oct 6 04:30:06 UTC 2009 Responsible-Changed-Why: gonna have to replace this keyboard soon ... 
http://www.freebsd.org/cgi/query-pr.cgi?pr=139363 From owner-freebsd-fs@FreeBSD.ORG Tue Oct 6 18:39:47 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D23711065692 for ; Tue, 6 Oct 2009 18:39:47 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 1BD258FC24 for ; Tue, 6 Oct 2009 18:39:46 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 076E645EA4; Tue, 6 Oct 2009 20:39:44 +0200 (CEST) Received: from localhost (chello087206049004.chello.pl [87.206.49.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id E5A0645B36; Tue, 6 Oct 2009 20:39:38 +0200 (CEST) Date: Tue, 6 Oct 2009 20:39:37 +0200 From: Pawel Jakub Dawidek To: Charles Owens Message-ID: <20091006183937.GA1639@garage.freebsd.pl> References: <4ACA015D.3090800@greatbaysoftware.com> <20091005161258.GE1702@garage.freebsd.pl> <4ACA1EEA.3070204@greatbaysoftware.com> <20091005165852.GF1702@garage.freebsd.pl> <4ACA3B37.6070501@greatbaysoftware.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="7AUc2qLy4jB3hD7Z" Content-Disposition: inline In-Reply-To: <4ACA3B37.6070501@greatbaysoftware.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: gjournal crash: "error while writing data (error=6)" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Oct 2009 18:39:47 -0000 --7AUc2qLy4jB3hD7Z Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Mon, Oct 05, 2009 at 02:30:15PM -0400, Charles Owens wrote:
> >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
> >>>>     ad4s1a[WRITE(offset=43561402368, length=16384)]
> >>>> GEOM_JOURNAL: [copy] Error while writting data (error=6) \
> >>>>     ad4s1a[WRITE(offset=48868164096, length=16896)]
[...]
> Here's the partition size from the real system in question: 59055800320
> (55 GB)

That's better, now we know that gjournal was using valid offsets.

This still doesn't solve your mystery, but whatever it is, it doesn't
look like gjournal, because the error you are seeing was received from
the provider below (ad4s1a). The error value suggests that this
partition disappeared or that something else bad happened. The logs you
pasted don't tell us what, so I can only guess. As I suggested earlier,
it could be that the disk was detached on an error. That happens
sometimes on my home file server: its controller doesn't report errors,
but detaches the disk instead.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
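
The bounds check behind the "valid offset" remark can be reproduced with a little arithmetic. What follows is only a sketch for readers of the thread, not something posted to the list: the sizes and offsets are the ones quoted in the messages above, the helper names (`fits`, `region`) are made up for illustration, and treating Jstart/Jend as the byte range of the journal on ad4s1a is an assumption read off the `gjournal list` output.

```python
SECTOR = 512  # bytes per sector, as in the "* 512" arithmetic in the thread

# Size of partition "a" in bytes, from the two bsdlabel listings in the thread.
WRONG_SIZE = 77594624 * SECTOR   # similar system with a smaller drive
REAL_SIZE = 115343360 * SECTOR   # the system that actually crashed

# From `gjournal list` for ad4s1a.
# Assumption: the journal occupies bytes [JSTART, JEND) of the consumer.
JSTART = 45182090752
JEND = 59055799808

# (offset, length) of the two failed writes from the console messages.
WRITES = [(43561402368, 16384), (48868164096, 16896)]


def fits(offset, length, size):
    """Return True if [offset, offset + length) lies inside the partition."""
    return offset >= 0 and offset + length <= size


def region(offset):
    """Classify an offset on ad4s1a under the assumed data/journal layout."""
    if offset < JSTART:
        return "data"
    if offset < JEND:
        return "journal"
    return "beyond journal"


for offset, length in WRITES:
    print(
        f"offset={offset} length={length} "
        f"fits_wrong_size={fits(offset, length, WRONG_SIZE)} "
        f"fits_real_size={fits(offset, length, REAL_SIZE)} "
        f"region={region(offset)}"
    )
```

With the first reported size (39728447488 bytes) both writes would land past the end of the partition; with the corrected size (59055800320 bytes) both fit, which is consistent with the conclusion that the error came from below gjournal rather than from a bad offset.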
--7AUc2qLy4jB3hD7Z Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFKy47pForvXbEpPzQRAsO3AJ49T6F/OPmZj/HxJjgqg4eKU9qoJwCfczDR L6c9UygUsMLQ/0EQXkSPbNc= =8EHE -----END PGP SIGNATURE----- --7AUc2qLy4jB3hD7Z-- From owner-freebsd-fs@FreeBSD.ORG Wed Oct 7 18:10:41 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 581B41065676; Wed, 7 Oct 2009 18:10:41 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 2F68F8FC1E; Wed, 7 Oct 2009 18:10:41 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n97IAfHw029274; Wed, 7 Oct 2009 18:10:41 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n97IAfGW029270; Wed, 7 Oct 2009 18:10:41 GMT (envelope-from linimon) Date: Wed, 7 Oct 2009 18:10:41 GMT Message-Id: <200910071810.n97IAfGW029270@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/139407: [smbfs] [panic] smb mount causes system crash if remote share no longer accessible X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Oct 2009 18:10:41 -0000 Old Synopsis: smb mount causes system crash if remote share no longer accessible New Synopsis: [smbfs] [panic] smb mount causes system crash if remote share no longer accessible Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Oct 7 18:10:01 UTC 2009 
Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=139407 From owner-freebsd-fs@FreeBSD.ORG Wed Oct 7 23:18:35 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4B47106568D; Wed, 7 Oct 2009 23:18:35 +0000 (UTC) (envelope-from delphij@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BB48F8FC24; Wed, 7 Oct 2009 23:18:35 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n97NIZ8P090132; Wed, 7 Oct 2009 23:18:35 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n97NIZ7S090128; Wed, 7 Oct 2009 23:18:35 GMT (envelope-from delphij) Date: Wed, 7 Oct 2009 23:18:35 GMT Message-Id: <200910072318.n97NIZ7S090128@freefall.freebsd.org> To: citrin@citrin.ru, delphij@FreeBSD.org, freebsd-fs@FreeBSD.org, delphij@FreeBSD.org From: delphij@FreeBSD.org Cc: Subject: Re: kern/127213: [tmpfs] [patch] sendfile on tmpfs data corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Oct 2009 23:18:36 -0000 Synopsis: [tmpfs] [patch] sendfile on tmpfs data corruption State-Changed-From-To: open->patched State-Changed-By: delphij State-Changed-When: Wed Oct 7 23:17:33 UTC 2009 State-Changed-Why: Patch from gk@ has been applied against -HEAD, MFC reminder. Responsible-Changed-From-To: freebsd-fs->delphij Responsible-Changed-By: delphij Responsible-Changed-When: Wed Oct 7 23:17:33 UTC 2009 Responsible-Changed-Why: Take. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127213 From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 00:22:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 02A041065670; Thu, 8 Oct 2009 00:22:16 +0000 (UTC) (envelope-from oliver.pntr@gmail.com) Received: from mail-gx0-f214.google.com (mail-gx0-f214.google.com [209.85.217.214]) by mx1.freebsd.org (Postfix) with ESMTP id 865168FC08; Thu, 8 Oct 2009 00:22:15 +0000 (UTC) Received: by gxk6 with SMTP id 6so4952358gxk.13 for ; Wed, 07 Oct 2009 17:22:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=sYWHTuw/2oF0MfJEcQAuJ/LkTi+pBtmSehEDg9cPCKE=; b=hp2oxbmR1NktVCL6oP80rm79cx35dJC8rHwoOTPadLdHRQp+BRLziucgs7g4aiGO2W wm+Gflf3vdo0tSgaq/2NeCWhx/fIhk/vcBunA0vC21BGsZNZaTxsksAtUKdlCm8mQSS9 0mxlyR+vAf6n4nxSJJAXXqS/vOC7BA60hJGLc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=b9Cswspi5E/LN4ezVxoybqFMfORHfxZKvFEaAYRNqN0Mf4HkVsbBq5un+XuIyA+6Lm vTuWusWEcuBj43YvHb6biAmKNJtEvj+7a3U59mvkjcC6Ds9yEpGeIJvyBBBhEe467/He oieHgQ8QsYaX5gb27EolnQFrdju/wK+ArG2RY= MIME-Version: 1.0 Received: by 10.91.191.17 with SMTP id t17mr345266agp.51.1254960027608; Wed, 07 Oct 2009 17:00:27 -0700 (PDT) In-Reply-To: <200910071810.n97IAfGW029270@freefall.freebsd.org> References: <200910071810.n97IAfGW029270@freefall.freebsd.org> Date: Thu, 8 Oct 2009 02:00:27 +0200 Message-ID: <6101e8c40910071700q62982a0aqba04e8cd84d4f4b8@mail.gmail.com> From: Oliver Pinter To: linimon@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, freebsd-bugs@freebsd.org Subject: Re: kern/139407: [smbfs] [panic] smb mount causes system crash if remote share no longer accessible 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 00:22:16 -0000 http://lists.freebsd.org/pipermail/freebsd-stable/2009-July/051099.html On 10/7/09, linimon@freebsd.org wrote: > Old Synopsis: smb mount causes system crash if remote share no longer > accessible > New Synopsis: [smbfs] [panic] smb mount causes system crash if remote share > no longer accessible > > Responsible-Changed-From-To: freebsd-bugs->freebsd-fs > Responsible-Changed-By: linimon > Responsible-Changed-When: Wed Oct 7 18:10:01 UTC 2009 > Responsible-Changed-Why: > Over to maintainer(s). > > http://www.freebsd.org/cgi/query-pr.cgi?pr=139407 > _______________________________________________ > freebsd-bugs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-bugs > To unsubscribe, send any mail to "freebsd-bugs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 05:07:32 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D8541065672 for ; Thu, 8 Oct 2009 05:07:32 +0000 (UTC) (envelope-from mattjeet@gmail.com) Received: from mail-vw0-f180.google.com (mail-vw0-f180.google.com [209.85.212.180]) by mx1.freebsd.org (Postfix) with ESMTP id 44F1A8FC0A for ; Thu, 8 Oct 2009 05:07:31 +0000 (UTC) Received: by vws10 with SMTP id 10so3141621vws.7 for ; Wed, 07 Oct 2009 22:07:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:reply-to:received:date :x-google-sender-auth:message-id:subject:from:to:content-type; bh=3IeclnxQf9uaZs58x6thtuw7iBDX5iVOJuWZgRwa/H0=; b=a08PcHcdSKjo0OXuwuey8cg+hIHUh/aMbDhZUN+5J3JE2ResH/lRCI1zITql201OQz MWEoobjOR6sEik5eGqlpYyprJBUuaLetoYCmSw1xGEskiP8s7GOmTeruppfR0duxe+49 
WCyP7Fxe2rzonuLwP5FrWzsy0+GppwFLz9BP0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:reply-to:date:x-google-sender-auth:message-id :subject:from:to:content-type; b=d+Bxx0FCFgb22OvdYjrz7j0D+Gm92fi/AKfn9RFdrGyumP0ElxvELisjsnHUugZtLX YNXkehLvmVoxsBE9poYwy19cKtR6ZaXDOvs+u1kwjBm9S9mJZzhtkyE/oGa5OvEckt8G vbKSOuKJxemKUrJui85F27fv62L02vAInMBzc= MIME-Version: 1.0 Sender: mattjeet@gmail.com Received: by 10.220.104.212 with SMTP id q20mr1201454vco.107.1254976833854; Wed, 07 Oct 2009 21:40:33 -0700 (PDT) Date: Wed, 7 Oct 2009 21:40:33 -0700 X-Google-Sender-Auth: dd7105beb95db32b Message-ID: <9740caf0910072140s6c12ebbm4aa6e1b6019b2d50@mail.gmail.com> From: Matt Olander To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: glusterfs/zfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: matt@ixsystems.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 05:07:32 -0000 We've got an opportunity to work with a popular open source project building a server farm for their bug reporting database. They are running out of inodes on some expensive NetApp equipment so FreeBSD and ZFS look like a good potential solution. Does anyone have experience running glusterfs on FreeBSD 8 yet? They'd like to use glusterfs for this setup. 
best, -matt From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 08:42:23 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FC94106566B; Thu, 8 Oct 2009 08:42:23 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id E06CA8FC1B; Thu, 8 Oct 2009 08:42:22 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 83AD1141C2C; Thu, 8 Oct 2009 10:42:20 +0200 (CEST) X-CRM114-Version: 20090423-BlameSteveJobs ( TRE 0.7.6 (BSD) ) MF-ACE0E1EA [pR: 15.5708] X-CRM114-CacheID: sfid-20091008_10422_61AB5602 X-CRM114-Status: Good ( pR: 15.5708 ) Message-ID: <4ACDA5EA.2010600@fsn.hu> Date: Thu, 08 Oct 2009 10:42:18 +0200 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> In-Reply-To: <20091002184526.GA1660@garage.freebsd.pl> X-Stationery: 0.4.10 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.3 (people.fsn.hu); Thu, 08 Oct 2009 10:42:19 +0200 (CEST) Cc: freebsd-fs@FreeBSD.org Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 08:42:23 -0000 Hello, Pawel Jakub Dawidek wrote: > On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote: > >> Backing out this change from the 8-STABLE kernel: >> 
http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902 >> >> makes it survive about half and hour of IMAP searching. Of course only >> time will tell whether this helps in the long run, but so far 10/10 >> tries succeeded to kill the machine with this method... >> > > Could you try this patch: > > http://people.freebsd.org/~pjd/patches/arc.c.4.patch > It seems (after running for two days) that this fixes my problem. And I see that Kip has came out with a similar version (which I couldn't yet test, but hope that will also do). Thanks! From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 09:37:23 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 524EA106568F for ; Thu, 8 Oct 2009 09:37:23 +0000 (UTC) (envelope-from solon@pyro.de) Received: from srv23.fsb.echelon.bnd.org (mail.pyro.de [83.137.99.96]) by mx1.freebsd.org (Postfix) with ESMTP id EE3F98FC13 for ; Thu, 8 Oct 2009 09:37:22 +0000 (UTC) Received: from port-87-193-183-44.static.qsc.de ([87.193.183.44] helo=flash.home) by srv23.fsb.echelon.bnd.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.69 (FreeBSD)) (envelope-from ) id 1MvpR0-000Fzl-1b for freebsd-fs@freebsd.org; Thu, 08 Oct 2009 11:37:21 +0200 Date: Thu, 8 Oct 2009 11:37:16 +0200 From: Solon Lutz X-Mailer: The Bat! (v3.99.25) Professional Organization: pyro.labs berlin X-Priority: 3 (Normal) Message-ID: <886802879.20091008113716@pyro.de> To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -1.4 (-) X-Spam-Report: Spam detection software, running on the system "srv23.fsb.echelon.bnd.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. 
If you have any questions, see The administrator of that system for details. Content preview: I built a 9x hdd 11TB raidz for some rescue purposes and started copying an image from another partition via "dd if=/dev/da0..." to it. It consists of: ad4 da1 da2 da3 da4 da5 da6 da7 da8, da1 to da8 are connected via two highpoint controllers. [...] Content analysis details: (-1.4 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.4 ALL_TRUSTED Passed through trusted hosts only via SMTP X-Spam-Flag: NO Subject: raidz slowing down X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 09:37:23 -0000

I built an 11TB raidz out of nine hard drives for rescue purposes and
started copying an image from another partition to it via
"dd if=/dev/da0...".

The pool consists of ad4 and da1 through da8; da1 to da8 are connected
via two HighPoint controllers.

In the beginning, write speeds were quite fair:

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
    0    424      0      0    0.0    424  52483   33.9   84.6| ad4
    0      0      0      0    0.0      0      0    0.0    0.0| da0
   35    356      0      0    0.0    356  44584   76.4  124.5| da1
   35    296      0      0    0.0    296  36919   84.5  121.0| da2
   34    361      0      0    0.0    361  45111   75.5  124.7| da3
   35    346      0      0    0.0    346  43196   78.6  123.2| da4
   35    344      0      0    0.0    344  42940   80.0  124.7| da5
   35    343      0      0    0.0    343  42812   80.7  124.5| da6
   35    344      0      0    0.0    344  43051   79.8  123.9| da7
   34    342      0      0    0.0    342  42796   80.6  124.4| da8

Now, some 10 hours and 2.5TB later, it looks like this most of the time:

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
    0     10      0      0    0.0     10      6    0.8    0.2| ad4
    0      0      0      0    0.0      0      0    0.0    0.0| da0
    4     13      0      0    0.0     13      8  550.4  178.5| da1
    0     12      0      0    0.0     12      7    0.7    0.2| da2
    0     11      0      0    0.0     11      7    0.7    0.2| da3
    0     10      0      0    0.0     10      5    0.6    0.2| da4
    0     11      0      0    0.0     11      6    0.9    0.3| da5
    0     12      0      0    0.0     12      7    0.7    0.2| da6
    0     11      0      0    0.0     11      7    0.7    0.2| da7
    0      9      0      0    0.0      9      6    0.8    0.2| da8

da1 seems to be busy most of the time, and every few seconds all the
other devices write some data at nearly normal speed:

dT: 1.003s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
    0    254      0      0    0.0    254  31331   34.9   35.4| ad4
    0      0      0      0    0.0      0      0    0.0    0.0| da0
    4      0      0      0    0.0      0      0    0.0    0.0| da1
    0    254      0      0    0.0    254  31346  107.4  104.5| da2
    0    256      0      0    0.0    256  31345  108.1  104.0| da3
    0    255      0      0    0.0    255  31345  110.2  105.1| da4
   35    200      0      0    0.0    200  24912  143.3  115.0| da5
   35    211      0      0    0.0    211  26303  137.8  114.9| da6
   35    210      0      0    0.0    210  26079  139.3  114.9| da7
   35    209      0      0    0.0    209  25952  135.2  113.7| da8

Sometimes it even gets back to 'normal' behaviour, but it never reaches
the speeds it once had:

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
   35    274      0      0    0.0    274  34334   44.2   66.6| ad4
    0   1166   1166 149243    0.1      0      0    0.0   14.3| da0
   35    120      0      0    0.0    120  14717   94.4   64.5| da1
   35     96      0      0    0.0     96  11665  113.9   64.3| da2
   35    100      0      0    0.0    100  12288   98.7   63.9| da3
   35    103      0      0    0.0    103  12496   93.4   59.4| da4
   34    112      0      0    0.0    112  13694  106.1   67.4| da5
   35     71      0      0    0.0     71   8596  115.3   66.8| da6
   35    116      0      0    0.0    116  14205  101.7   67.3| da7
   35     83      0      0    0.0     83  10066  112.2   65.9| da8

Syslog reports the following:

Oct 8 09:53:40 radium kernel: hptrr: start channel [0,0]
Oct 8 09:53:40 radium kernel: hptrr: channel [0,0] started successfully
Oct 8 09:57:44 radium kernel: hptrr: start channel [0,0]
Oct 8 09:57:45 radium kernel: hptrr: channel [0,0] started successfully
Oct 8 10:54:26 radium kernel: hptrr: start channel [0,0]
Oct 8 10:54:27 radium kernel: hptrr: channel [0,0] started successfully
Oct 8 11:10:29 radium kernel: hptrr: start channel [0,0]
Oct 8 11:10:30 radium kernel: hptrr: channel [0,0] started successfully
Oct 8 11:17:27 radium kernel: hptrr: start channel [0,0]
Oct 8 11:17:27 radium kernel: hptrr: channel [0,0] started successfully

Is this a problem with the hptrr device, or is da1 failing?

Mit freundlichen Grüßen
Best regards,

Solon Lutz

+-----------------------------------------------+
| Pyro.Labs Berlin - Creativity for tomorrow    |
| Wasgenstrasse 75/13 - 14129 Berlin, Germany   |
| www.pyro.de - phone + 49 - 30 - 48 48 58 58   |
| info@pyro.de - fax + 49 - 30 - 80 94 03 52    |
+-----------------------------------------------+

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 12:45:10 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF4BE1065672; Thu, 8 Oct 2009 12:45:09 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id E31AB8FC0A; Thu, 8 Oct 2009 12:45:08 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 2C8EE141099; Thu, 8 Oct 2009 14:45:07 +0200 (CEST) X-CRM114-Version: 20090423-BlameSteveJobs ( TRE 0.7.6 (BSD) ) MF-ACE0E1EA [pR: 18.8731] X-CRM114-CacheID: sfid-20091008_14450_05B4CBDC X-CRM114-Status: Good ( pR: 18.8731 ) Message-ID: <4ACDDED0.2070707@fsn.hu> Date: Thu, 08 Oct 2009 14:45:04
+0200
From: Attila Nagy
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0
MIME-Version: 1.0
To: Pawel Jakub Dawidek
References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <4ACDA5EA.2010600@fsn.hu>
In-Reply-To: <4ACDA5EA.2010600@fsn.hu>
X-Stationery: 0.4.10
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.3 (people.fsn.hu); Thu, 08 Oct 2009 14:45:05 +0200 (CEST)
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 08 Oct 2009 12:45:10 -0000

Attila Nagy wrote:
> Hello,
>
> Pawel Jakub Dawidek wrote:
>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>
>>> Backing out this change from the 8-STABLE kernel:
>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>>
>>>
>>> makes it survive about half an hour of IMAP searching. Of course
>>> only time will tell whether this helps in the long run, but so far
>>> 10/10 tries succeeded in killing the machine with this method...
>>>
>>
>> Could you try this patch:
>>
>> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>
> It seems (after running for two days) that this fixes my problem. And
> I see that Kip has come out with a similar version (which I couldn't
> yet test, but hope that will also do).
It seems that I was a little bit quick regarding this.
The machine just stopped with this:

last pid: 32358;  load averages: 0.01, 0.04, 0.12  up 2+06:33:56  14:36:25
114 processes: 1 running, 112 sleeping, 1 zombie
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
Swap: 4096M Total, 15M Used, 4081M Free

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
24025 root       1  44    0  3932K   992K vmwait 0   6:06  0.00% zpool
84190 root       1  44    0  4700K  1592K CPU1   1   4:17  0.00% top
99029 root       1  44    0  4132K  1212K nanslp 1   3:53  0.00% gstat
26317 root       1  44    0  1528K   352K piperd 1   3:38  0.00% readproctitl
49143 125        4  45    0 12248K  3788K sigwai 0   2:50  0.00% milter-greyl
39969 root       1  44    0  1536K   516K vmwait 0   2:50  0.00% supervise
40241 root       1  44    0  1536K   516K vmwait 0   2:47  0.00% supervise
44633 root       1  44    0  1536K   512K vmwait 0   2:43  0.00% supervise
43434 root       1  44    0  1536K   516K vmwait 0   2:43  0.00% supervise
50575 root       1  44    0  1536K   516K vmwait 0   2:42  0.00% supervise
45510 root       1  44    0  1536K   512K vmwait 0   2:42  0.00% supervise
58146 60         1  44    0   264M  8828K pfault 0   2:32  0.00% imapd
47526 389        6  44    0 92688K  2296K ucond  1   1:29  0.00% slapd
 5417 root       1  44    0  9396K  1680K pfault 1   1:26  0.00% sshd
13147 root       1  44    0  3340K   860K vmwait 1   0:45  0.00% syslogd
92597 root       1  44    0  9396K  1676K pfault 1   0:39  0.00% sshd
26437 125        1  44    0  6924K  1700K vmwait 0   0:33  0.00% qmgr

The above top was still refreshing, but everything else on the other ssh
consoles (like a running zpool iostat and gstat) was frozen.
Even top stopped when I resized the window.
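
For anyone comparing several such snapshots, here is a minimal sketch, assuming the process rows of the top output above are pasted into a string. The helper name `state_counts` and the variable `TOP_ROWS` are made up for illustration and are not part of any FreeBSD tool. It only tallies the STATE column, so it shows how many of the listed processes sit in vmwait or pfault; it does not diagnose anything by itself.

```python
from collections import Counter

# Process rows copied from the top snapshot above (header line omitted).
TOP_ROWS = """\
24025 root 1 44 0 3932K 992K vmwait 0 6:06 0.00% zpool
84190 root 1 44 0 4700K 1592K CPU1 1 4:17 0.00% top
99029 root 1 44 0 4132K 1212K nanslp 1 3:53 0.00% gstat
26317 root 1 44 0 1528K 352K piperd 1 3:38 0.00% readproctitl
49143 125 4 45 0 12248K 3788K sigwai 0 2:50 0.00% milter-greyl
39969 root 1 44 0 1536K 516K vmwait 0 2:50 0.00% supervise
40241 root 1 44 0 1536K 516K vmwait 0 2:47 0.00% supervise
44633 root 1 44 0 1536K 512K vmwait 0 2:43 0.00% supervise
43434 root 1 44 0 1536K 516K vmwait 0 2:43 0.00% supervise
50575 root 1 44 0 1536K 516K vmwait 0 2:42 0.00% supervise
45510 root 1 44 0 1536K 512K vmwait 0 2:42 0.00% supervise
58146 60 1 44 0 264M 8828K pfault 0 2:32 0.00% imapd
47526 389 6 44 0 92688K 2296K ucond 1 1:29 0.00% slapd
5417 root 1 44 0 9396K 1680K pfault 1 1:26 0.00% sshd
13147 root 1 44 0 3340K 860K vmwait 1 0:45 0.00% syslogd
92597 root 1 44 0 9396K 1676K pfault 1 0:39 0.00% sshd
26437 125 1 44 0 6924K 1700K vmwait 0 0:33 0.00% qmgr
"""


def state_counts(rows: str) -> Counter:
    """Count processes per STATE value (8th whitespace-separated field)."""
    counts = Counter()
    for line in rows.splitlines():
        fields = line.split()
        # A process row has 12 fields and starts with a numeric PID.
        if len(fields) >= 12 and fields[0].isdigit():
            counts[fields[7]] += 1
    return counts


counts = state_counts(TOP_ROWS)
for state, n in counts.most_common():
    print(f"{state:8s} {n}")
```

With the 17 rows shown, this reports 9 processes in vmwait and 3 in pfault; the remaining states occur once each.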
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 16:07:26 2009
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5DABA106566B; Thu, 8 Oct 2009 16:07:26 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id A13608FC13; Thu, 8 Oct 2009 16:07:25 +0000 (UTC)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id DE49445E9C; Thu, 8 Oct 2009 18:07:23 +0200 (CEST)
Received: from localhost (pdawidek.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 7A57D45E97; Thu, 8 Oct 2009 18:07:18 +0200 (CEST)
Date: Thu, 8 Oct 2009 18:07:18 +0200
From: Pawel Jakub Dawidek
To: Attila Nagy
Message-ID: <20091008160718.GB2134@garage.freebsd.pl>
References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <4ACDA5EA.2010600@fsn.hu> <4ACDDED0.2070707@fsn.hu>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="T4sUOijqQbZv57TR"
Content-Disposition: inline
In-Reply-To: <4ACDDED0.2070707@fsn.hu>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 9.0-CURRENT i386
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl
X-Spam-Level:
X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 08 Oct 2009 16:07:26 -0000

--T4sUOijqQbZv57TR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:
> Attila Nagy wrote:
> >Hello,
> >
> >Pawel Jakub Dawidek wrote:
> >>On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
> >>
> >>>Backing out this change from the 8-STABLE kernel:
> >>>http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
> >>>
> >>>
> >>>makes it survive about half an hour of IMAP searching. Of course
> >>>only time will tell whether this helps in the long run, but so far
> >>>10/10 tries succeeded in killing the machine with this method...
> >>>
> >>
> >>Could you try this patch:
> >>
> >> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
> >>
> >It seems (after running for two days) that this fixes my problem. And
> >I see that Kip has come out with a similar version (which I couldn't
> >yet test, but hope that will also do).
> It seems that I was a little bit quick regarding this.
> The machine just stopped with this:
> last pid: 32358;  load averages: 0.01, 0.04, 0.12  up 2+06:33:56  14:36:25
> 114 processes: 1 running, 112 sleeping, 1 zombie
> CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
> Swap: 4096M Total, 15M Used, 4081M Free
>
>   PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> 24025 root       1  44    0  3932K   992K vmwait 0   6:06  0.00% zpool
> 84190 root       1  44    0  4700K  1592K CPU1   1   4:17  0.00% top
> 99029 root       1  44    0  4132K  1212K nanslp 1   3:53  0.00% gstat
> 26317 root       1  44    0  1528K   352K piperd 1   3:38  0.00% readproctitl
> 49143 125        4  45    0 12248K  3788K sigwai 0   2:50  0.00% milter-greyl
> 39969 root       1  44    0  1536K   516K vmwait 0   2:50  0.00% supervise
> 40241 root       1  44    0  1536K   516K vmwait 0   2:47  0.00% supervise
> 44633 root       1  44    0  1536K   512K vmwait 0   2:43  0.00% supervise
> 43434 root       1  44    0  1536K   516K vmwait 0   2:43  0.00% supervise
> 50575 root       1  44    0  1536K   516K vmwait 0   2:42  0.00% supervise
> 45510 root       1  44    0  1536K   512K vmwait 0   2:42  0.00% supervise
> 58146 60         1  44    0   264M  8828K pfault 0   2:32  0.00% imapd
> 47526 389        6  44    0 92688K  2296K ucond  1   1:29  0.00% slapd
>  5417 root       1  44    0  9396K  1680K pfault 1   1:26  0.00% sshd
> 13147 root       1  44    0  3340K   860K vmwait 1   0:45  0.00% syslogd
> 92597 root       1  44    0  9396K  1676K pfault 1   0:39  0.00% sshd
> 26437 125        1  44    0  6924K  1700K vmwait 0   0:33  0.00% qmgr
>
> The above top was still refreshing, but everything else on the other ssh
> consoles (like a running zpool iostat and gstat) was frozen.
> Even top stopped when I resized the window.

Please try Kip's patch that was committed; it changes priorities a bit,
which should help.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--T4sUOijqQbZv57TR
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFKzg42ForvXbEpPzQRAo+qAJ47m/rKCxrVyRIZvU7OkhvTTnzNsgCg4qQr
kESbsclW6ojZ99eWuMu08Sc=
=xAkD
-----END PGP SIGNATURE-----

--T4sUOijqQbZv57TR--

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 16:52:45 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA82810656A3 for ; Thu, 8 Oct 2009 16:52:45 +0000 (UTC) (envelope-from bsdgroup.md@gmail.com)
Received: from mail-ew0-f218.google.com (mail-ew0-f218.google.com [209.85.219.218]) by mx1.freebsd.org (Postfix) with ESMTP id A10BF8FC08 for ; Thu, 8 Oct 2009 16:52:43 +0000 (UTC)
Received: by ewy18 with SMTP id 18so282092ewy.43 for ; Thu, 08 Oct 2009 09:52:42 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
 h=domainkey-signature:mime-version:received:date:message-id:subject
 :from:to:content-type;
 bh=+/Ox7ca4C5dGVZh+PB27CUs4LrmUzpmIBJ122S6dhI4=;
 b=Qjo2Q+wZy1S5tUcZg+R2sF7wpW21g9vA9LfHgOfTi3zQJMzzIie2FZMIs27ZTbG536
 aCS0wOOWG4dF267SjQdHn4/t0PN3p6i9WERsw5mtcUTKf5bcih6KU3jEPYQ66xzlGjL2
 q3SJCwyNw3a5kx9/txrp1/bUEho6JcQ2IxnDQ=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
 h=mime-version:date:message-id:subject:from:to:content-type;
 b=YLfj0pAPV232/kmtZfMWL3QBuc2ULi2rlnExWHKNHf4N7moRZNQntDDPpvIL8uOp5s
 t+tWz+e1zDSbAdaS80nTE2mEtKMtP9MdLMhLtwbOFUDRon1mxhFvKqmwabLpkn8+tny6
 QgtofQzchfVQm+3mCPbX8b18s0Ioh8SGQB1i4=
MIME-Version: 1.0
Received: by 10.211.128.14 with SMTP id f14mr8445235ebn.75.1255018832686; Thu, 08 Oct 2009 09:20:32 -0700 (PDT)
Date: Thu, 8 Oct 2009 19:20:32 +0300
Message-ID:
From: Rusu Silviu
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Panic on writing new files to fusefs devices, mounted with ntfs-3g
X-BeenThere:
freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 08 Oct 2009 16:52:45 -0000

A panic happens when a BitTorrent client starts a download, or when
copying a file > 4 GB. If the file to be downloaded already exists, there
is no panic, but the client says the file cannot be found even though it
is there. Reproduced on 2 machines, an iCore workstation and a Pentium
Mobile notebook, both with 8.0-RC1. Thank you for any suggestions.

Here is the dump from the notebook:

---------------------------------------------------------------
localhost dumped core - see /var/crash/vmcore.0

Thu Oct 8 19:29:41 UTC 2009

FreeBSD localhost 8.0-RC1 FreeBSD 8.0-RC1 #0: Thu Sep 17 20:45:19 UTC 2009 root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xe672bc44
frame pointer           = 0x28:0xe672bc68
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1128 (rtorrent)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1m2s
Physical memory: 1002 MB
Dumping 56 MB: 41 25 9

Reading symbols from /usr/local/modules/fuse.ko...done.
Loaded symbols for /usr/local/modules/fuse.ko
#0 doadump () at pcpu.h:246
246 pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0 doadump () at pcpu.h:246
#1 0xc08823c7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2 0xc08826b9 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:579
#3 0xc0bb346c in trap_fatal (frame=0xe672bc04, eva=0) at /usr/src/sys/i386/i386/trap.c:933
#4 0xc0bb36f0 in trap_pfault (frame=0xe672bc04, usermode=0, eva=0) at /usr/src/sys/i386/i386/trap.c:846
#5 0xc0bb40d5 in trap (frame=0xe672bc04) at /usr/src/sys/i386/i386/trap.c:528
#6 0xc0b96a4b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
#7 0x00000000 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

------------------------------------------------------------------------
ps -axl

Segmentation fault (core dumped)

------------------------------------------------------------------------
vmstat -s

        0 cpu context switches
        0 device interrupts
        0 software interrupts
        0 traps
        0 system calls
        0 kernel threads created
        0 fork() calls
        0 vfork() calls
        0 rfork() calls
        0 swap pager pageins
        0 swap pager pages paged in
        0 swap pager pageouts
        0 swap pager pages paged out
        0 vnode pager pageins
        0 vnode pager pages paged in
        0 vnode pager pageouts
        0 vnode pager pages paged out
        0 page daemon wakeups
        0 pages examined by the page daemon
       56 pages reactivated
        0 copy-on-write faults
        0 copy-on-write optimized faults
        0 zero fill pages zeroed
        0 zero fill pages prezeroed
        0 intransit blocking page faults
        0 total VM faults taken
        0 pages affected by kernel thread creation
        0 pages affected by fork()
        0 pages affected by vfork()
        0 pages affected by rfork()
       66 pages cached
        0 pages freed
        0 pages freed by daemon
    38835 pages freed by exiting processes
     2980 pages active
     1820 pages inactive
       10 pages in VM cache
     5745 pages wired down
   241343 pages free
     4096 bytes per page
    16186 total name lookups
          cache hits (83% pos + 10% neg) system 0% per-directory deletions
0%, falsehits 0%, toolong 0% ------------------------------------------------------------------------ vmstat -m Type InUse MemUse HighUse Requests Size(s) pfs_nodes 20 3K - 20 128 GEOM 101 18K - 633 16,32,64,128,512,1024,2048 acpi_perf 1 1K - 1 256 isadev 6 1K - 6 64 sbp 104 9K - 104 32,128 cdev 11 2K - 11 128 sigio 1 1K - 1 32 filedesc 46 12K - 1131 16,256,512 kenv 76 7K - 80 16,32,64,128,4096 kqueue 0 0K - 4 128,1024 proc-args 21 2K - 309 32,64,128 ithread 64 5K - 64 16,64,128 agp 1 1K - 1 16 acpica 2120 107K - 80694 16,32,64,128,256,512,1024 KTRACE 100 13K - 100 128 linker 116 76K - 149 16,32,256,1024,4096 lockf 17 1K - 17 32,64 ip6ndp 6 1K - 6 64,128 temp 25 229K - 4956 16,32,64,128,256,512,1024,2048,4096 devbuf 2485 3683K - 2516 16,32,64,128,256,512,1024,2048,4096 acpitask 1 1K - 1 1024 module 465 30K - 465 64,128 CAM SIM 2 1K - 2 128 mtx_pool 2 8K - 2 4096 kbdmux 6 10K - 6 16,256,1024,2048,4096 subproc 93 189K - 1178 256,4096 proc 2 8K - 2 4096 session 17 2K - 17 64 pgrp 19 2K - 20 64 cred 26 3K - 2670 64,128 uidinfo 2 2K - 2 64,1024 plimit 11 3K - 148 256 sysctltmp 0 0K - 228 16,32,64,128 sysctloid 3188 96K - 3277 16,32,64,128 sysctl 0 0K - 301 16,32,64 umtx 98 7K - 98 64 p1003.1b 1 1K - 1 16 SWAP 2 277K - 2 64 bus-sc 76 158K - 3769 16,32,64,128,256,512,1024,2048,4096 bus 1184 54K - 6575 16,32,64,128,512,1024 devstat 8 17K - 8 16,4096 eventhandler 69 4K - 69 32,64,128 kobj 326 652K - 396 2048 Per-cpu 1 1K - 1 16 rman 194 12K - 684 16,32,64 acpisem 17 2K - 17 64,128 sbuf 0 0K - 368 16,32,64,128,256,512,1024,2048,4096 CAM XPT 17 7K - 89 16,32,64,1024,2048 stack 0 0K - 2 128 taskqueue 13 1K - 13 16,64 Unitno 11 1K - 29 16,64 iov 0 0K - 4520 64,128,256 select 8 1K - 8 64 ioctlops 0 0K - 803 16,32,64,128,256,512,1024 msg 4 25K - 4 1024,4096 sem 4 6K - 4 256,512,1024,4096 shm 1 12K - 1 tty 20 10K - 22 512,2048 mbuf_tag 0 0K - 1 32 ksem 1 4K - 1 4096 shmfd 1 4K - 1 4096 CAM periph 2 1K - 18 16,32,64,128 pcb 36 79K - 43 16,64,512,1024,2048,4096 soname 3 1K - 76 
16,32,128 biobuf 4 8K - 6 2048 vfscache 1 512K - 1 vfs_hash 1 256K - 1 vnodes 2 1K - 2 128 vnodemarker 0 0K - 26 512 mount 94 3K - 138 16,32,64,128,256 BPF 5 1K - 5 64 ether_multi 13 1K - 14 16,32,64 ifaddr 48 10K - 48 16,32,64,128,256,512,2048 ifnet 6 6K - 6 64,1024 clone 5 20K - 5 4096 arpcom 2 1K - 2 16 fw_com 1 1K - 1 64 lltable 14 4K - 14 128,256 ata_generic 2 2K - 2 1024 ad_driver 1 1K - 1 32 ar_driver 0 0K - 6 512,2048 acd_driver 1 2K - 1 2048 routetbl 53 259K - 99 16,32,64,128,256,512 igmp 5 1K - 5 128 acpidev 81 3K - 81 32 in_multi 2 1K - 2 128 sctp_iter 0 0K - 3 128 sctp_ifn 2 1K - 2 128 sctp_ifa 4 1K - 4 128 sctp_vrf 1 1K - 1 64 sctp_a_it 0 0K - 3 16 hostcache 1 16K - 1 syncache 1 72K - 1 ppbusdev 3 1K - 3 128 in6_multi 9 1K - 9 16,256 mld 5 1K - 5 128 NFS FHA 1 1K - 1 1024 rpc 2 5K - 2 128,4096 audit_evclass 172 3K - 211 16 savedino 0 0K - 18 256 dirrem 4 1K - 21 32 mkdir 0 0K - 12 32 diradd 18 2K - 25 64 freefile 14 1K - 19 32 freeblks 12 3K - 15 256 freefrag 0 0K - 2 32 allocdirect 0 0K - 25 128 bmsafemap 0 0K - 7 64 newblk 1 1K - 26 64,256 inodedep 34 261K - 43 128 pagedep 3 33K - 11 64 ufs_dirhash 24 5K - 24 16,32,64,128,512 ufs_mount 12 25K - 12 256,2048,4096 entropy 1024 64K - 1024 64 CAM dev queue 2 1K - 2 64 vm_pgdata 2 65K - 2 64 fw_xfer 0 0K - 1 128 atkbddev 2 1K - 2 32 firewire 11 23K - 14 32,64,512,1024,2048,4096 UART 3 2K - 3 16,256,1024 apmdev 1 1K - 1 64 CAM queue 6 1K - 64 16,32 USBdev 16 5K - 16 32,128,1024 USB 28 5K - 28 16,32,64,1024 pci_link 16 2K - 16 32,128 DEVFS1 100 25K - 110 256 DEVFS3 120 15K - 130 128 memdesc 1 4K - 1 4096 nexusdev 3 1K - 3 16 DEVFS 16 1K - 17 16,64 fuse_messaging 5 2K - 16 128,256,512 fuse_filehandles 1 1K - 8 64 fuse_vnode 4 1K - 4 256 ------------------------------------------------------------------------ vmstat -z ITEM SIZE LIMIT USED FREE REQUESTS FAILURES UMA Kegs: 128, 0, 88, 2, 88, 0 UMA Zones: 888, 0, 88, 0, 88, 0 UMA Slabs: 284, 0, 646, 12, 1381, 0 UMA RCntSlabs: 544, 0, 195, 1, 195, 0 UMA Hash: 
128, 0, 4, 26, 4, 0 16 Bucket: 76, 0, 27, 23, 48, 0 32 Bucket: 140, 0, 24, 4, 45, 0 64 Bucket: 268, 0, 15, 13, 61, 13 128 Bucket: 524, 0, 25, 3, 1026, 112 VM OBJECT: 136, 0, 716, 38, 12772, 0 MAP: 140, 0, 7, 21, 7, 0 KMAP ENTRY: 72, 56180, 29, 130, 3546, 0 MAP ENTRY: 72, 0, 370, 54, 23404, 0 DP fakepg: 72, 0, 0, 0, 0, 0 SG fakepg: 72, 0, 0, 0, 0, 0 mt_zone: 2056, 0, 267, 124, 267, 0 16: 16, 0, 2809, 236, 42604, 0 32: 32, 0, 2454, 32, 39391, 0 64: 64, 0, 4336, 148, 11346, 0 128: 128, 0, 2025, 45, 10703, 0 256: 256, 0, 588, 42, 3023, 0 512: 512, 0, 67, 5, 3754, 0 1024: 1024, 0, 35, 157, 1260, 0 2048: 2048, 0, 364, 20, 625, 0 4096: 4096, 0, 290, 15, 5429, 0 Files: 56, 0, 74, 127, 4349, 0 TURNSTILE: 72, 0, 99, 51, 99, 0 umtx pi: 52, 0, 0, 0, 0, 0 MAC labels: 20, 0, 0, 0, 0, 0 PROC: 680, 0, 43, 5, 1128, 0 THREAD: 572, 0, 81, 17, 81, 0 SLEEPQUEUE: 32, 0, 99, 78, 99, 0 VMSPACE: 232, 0, 21, 30, 1106, 0 cpuset: 40, 0, 2, 182, 2, 0 audit_record: 816, 0, 0, 0, 0, 0 mbuf_packet: 256, 0, 260, 137, 306, 0 mbuf: 256, 0, 6, 135, 126, 0 mbuf_cluster: 2048, 25600, 384, 6, 384, 0 mbuf_jumbo_page: 4096, 12800, 0, 0, 0, 0 mbuf_jumbo_9k: 9216, 19200, 0, 0, 0, 0 mbuf_jumbo_16k: 16384, 12800, 0, 0, 0, 0 mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0 g_bio: 140, 0, 0, 168, 4616, 0 ttyinq: 152, 0, 120, 36, 480, 0 ttyoutq: 256, 0, 64, 11, 256, 0 ata_request: 200, 0, 1, 56, 2187, 0 ata_composite: 180, 0, 0, 0, 0, 0 VNODE: 268, 0, 461, 29, 481, 0 VNODEPOLL: 60, 0, 0, 0, 0, 0 S VFS Cache: 72, 0, 435, 42, 903, 0 L VFS Cache: 292, 0, 0, 0, 0, 0 NAMEI: 1024, 0, 0, 8, 6749, 0 DIRHASH: 1024, 0, 35, 1, 35, 0 NFSMOUNT: 520, 0, 0, 0, 0, 0 NFSNODE: 464, 0, 0, 0, 0, 0 pipe: 392, 0, 1, 19, 738, 0 ksiginfo: 80, 0, 32, 1024, 32, 0 itimer: 220, 0, 0, 0, 0, 0 KNOTE: 68, 0, 0, 112, 4, 0 socket: 412, 25605, 32, 22, 182, 0 unpcb: 172, 25622, 6, 40, 17, 0 ipq: 32, 904, 0, 0, 0, 0 udp_inpcb: 220, 25614, 2, 34, 136, 0 udpcb: 8, 25781, 2, 201, 136, 0 tcp_inpcb: 220, 25614, 25, 29, 28, 0 tcpcb: 632, 25602, 24, 12, 28, 0 tcptw: 
52, 5184, 1, 143, 1, 0 syncache: 112, 15365, 0, 0, 0, 0 hostcache: 76, 15400, 0, 0, 0, 0 tcpreass: 20, 1690, 0, 0, 0, 0 sackhole: 20, 0, 0, 0, 0, 0 sctp_ep: 848, 25600, 0, 0, 0, 0 sctp_asoc: 1460, 40000, 0, 0, 0, 0 sctp_laddr: 24, 80040, 0, 145, 3, 0 sctp_raddr: 420, 80001, 0, 0, 0, 0 sctp_chunk: 96, 400000, 0, 0, 0, 0 sctp_readq: 76, 400000, 0, 0, 0, 0 sctp_stream_msg_out: 64, 400020, 0, 0, 0, 0 sctp_asconf: 24, 400055, 0, 0, 0, 0 sctp_asconf_ack: 24, 400055, 0, 0, 0, 0 ripcb: 220, 25614, 0, 0, 0, 0 rtentry: 108, 0, 9, 63, 9, 0 selfd: 28, 0, 18, 236, 1793, 0 ip4flow: 40, 4140, 23, 161, 24, 0 ip6flow: 64, 4118, 0, 0, 0, 0 Mountpoints: 644, 0, 6, 6, 6, 0 FFS inode: 116, 0, 415, 14, 434, 0 FFS1 dinode: 128, 0, 0, 0, 0, 0 FFS2 dinode: 256, 0, 415, 20, 434, 0 SWAPMETA: 276, 121576, 0, 0, 0, 0 ------------------------------------------------------------------------ vmstat -i interrupt total rate irq0: clk 62481 2403 irq1: atkbd0 143 5 irq8: rtc 7997 307 irq11: cbb0 cbb1+* 92 3 irq14: ata0 1415 54 irq15: ata1 109 4 Total 72237 2778 ------------------------------------------------------------------------ pstat -T 74/12328 files 0M/1622M swap space ------------------------------------------------------------------------ pstat -s Device 512-blocks Used Avail Capacity /dev/ad0s2b 3322440 0 3322440 0% ------------------------------------------------------------------------ iostat iostat: kvm_read(_tk_nin): invalid address (0x0) iostat: disabling TTY statistics iostat: kvm_getcptime: invalid address (0x0) iostat: disabling CPU time statistics ad0 KB/t tps MB/s 16.88 44 0.72 ------------------------------------------------------------------------ ipcs -a Message Queues: T ID KEY MODE OWNER GROUP CREATOR CGROUP CBYTES QNUM QBYTES LSPID LRPID STIME RTIME CTIME Shared Memory: T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME Semaphores: T ID KEY MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME CTIME 
------------------------------------------------------------------------ ipcs -T msginfo: msgmax: 16384 (max characters in a message) msgmni: 40 (# of message queues) msgmnb: 2048 (max characters in a message queue) msgtql: 40 (max # of messages in system) msgssz: 8 (size of a message segment) msgseg: 2048 (# of message segments in system) shminfo: shmmax: 33554432 (max shared memory segment size) shmmin: 1 (min shared memory segment size) shmmni: 192 (max number of shared memory identifiers) shmseg: 128 (max shared memory segments per process) shmall: 8192 (max amount of shared memory in pages) seminfo: semmap: 30 (# of entries in semaphore map) semmni: 10 (# of semaphore identifiers) semmns: 60 (# of semaphores in system) semmnu: 30 (# of undo structures in system) semmsl: 60 (max # of semaphores per id) semopm: 100 (max # of operations per semop call) semume: 10 (max # of undo entries per process) semusz: 136 (size in bytes of undo structure) semvmx: 32767 (semaphore maximum value) semaem: 16384 (adjust on exit max value) ------------------------------------------------------------------------ nfsstat Client Info: Rpc Counts: Getattr Setattr Lookup Readlink Read Write Create Remove 0 0 0 0 0 0 0 0 Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access 0 0 0 0 0 0 0 0 Mknod Fsstat Fsinfo PathConf Commit 0 0 0 0 0 Rpc Info: TimedOut Invalid X Replies Retries Requests 0 0 0 0 0 Cache Info: Attr Hits Misses Lkup Hits Misses BioR Hits Misses BioW Hits Misses 0 0 0 0 0 0 0 0 BioRLHits Misses BioD Hits Misses DirE Hits Misses 0 0 0 0 0 0 Server Info: Getattr Setattr Lookup Readlink Read Write Create Remove 0 0 0 0 0 0 0 0 Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access 0 0 0 0 0 0 0 0 Mknod Fsstat Fsinfo PathConf Commit 0 0 0 0 0 Server Ret-Failed 0 Server Faults 0 Server Cache Stats: Inprog Idem Non-idem Misses 0 0 0 0 Server Write Gathering: WriteOps WriteRPC Opsaved 0 0 0 ------------------------------------------------------------------------ netstat -s 
tcp: 69 packets sent 32 data packets (1637 bytes) 0 data packets (0 bytes) retransmitted 0 data packets unnecessarily retransmitted 0 resends initiated by MTU discovery 14 ack-only packets (1 delayed) 0 URG only packets 0 window probe packets 0 window update packets 23 control packets 45 packets received 38 acks (for 1521 bytes) 0 duplicate acks 0 acks for unsent data 21 packets (12121 bytes) received in-sequence 0 completely duplicate packets (0 bytes) 0 old duplicate packets 0 packets with some dup. data (0 bytes duped) 0 out-of-order packets (0 bytes) 0 packets (0 bytes) of data after window 0 window probes 1 window update packet 0 packets received after close 0 discarded for bad checksums 0 discarded for bad header offset fields 0 discarded because packet too short 0 discarded due to memory problems 22 connection requests 0 connection accepts 0 bad connection attempts 0 listen queue overflows 0 ignored RSTs in the windows 10 connections established (including accepts) 3 connections closed (including 0 drops) 0 connections updated cached RTT on close 0 connections updated cached RTT variance on close 0 connections updated cached ssthresh on close 2 embryonic connections dropped 38 segments updated rtt (of 55 attempts) 0 retransmit timeouts 0 connections dropped by rexmit timeout 0 persist timeouts 0 connections dropped by persist timeout 0 Connections (fin_wait_2) dropped because of timeout 0 keepalive timeouts 0 keepalive probes sent 0 connections dropped by keepalive 0 correct ACK header predictions 4 correct data packet header predictions 0 syncache entries added 0 retransmitted 0 dupsyn 0 dropped 0 completed 0 bucket overflow 0 cache overflow 0 reset 0 stale 0 aborted 0 badack 0 unreach 0 zone failures 0 cookies sent 0 cookies received 0 SACK recovery episodes 0 segment rexmits in SACK recovery episodes 0 byte rexmits in SACK recovery episodes 0 SACK options (SACK blocks) received 0 SACK options (SACK blocks) sent 0 SACK scoreboard overflow 0 packets with 
ECN CE bit set 0 packets with ECN ECT(0) bit set 0 packets with ECN ECT(1) bit set 0 successful ECN handshakes 0 times ECN reduced the congestion window udp: 2 datagrams received 0 with incomplete header 0 with bad data length field 0 with bad checksum 0 with no checksum 0 dropped due to no socket 0 broadcast/multicast datagrams undelivered 0 dropped due to full socket buffers 0 not for hashed pcb 2 delivered 2 datagrams output 0 times multicast source filter matched ip: 47 total packets received 0 bad header checksums 0 with size smaller than minimum 0 with data size < data length 0 with ip length > max ip packet size 0 with header length < data size 0 with data length < header length 0 with bad options 0 with incorrect version number 0 fragments received 0 fragments dropped (dup or out of space) 0 fragments dropped after timeout 0 packets reassembled ok 47 packets for this host 0 packets for unknown/unsupported protocol 0 packets forwarded (0 packets fast forwarded) 0 packets not forwardable 0 packets received for unknown multicast group 0 redirects sent 71 packets sent from this host 0 packets sent with fabricated ip header 0 output packets dropped due to no bufs, etc. 
0 output packets discarded due to no route 0 output datagrams fragmented 0 fragments created 0 datagrams that can't be fragmented 0 tunneling packets that can't find gif 0 datagrams with bad address in header icmp: 0 calls to icmp_error 0 errors not generated in response to an icmp message 0 messages with bad code fields 0 messages less than the minimum length 0 messages with bad checksum 0 messages with bad length 0 multicast echo requests ignored 0 multicast timestamp requests ignored 0 message responses generated 0 invalid return addresses 0 no return routes igmp: 0 messages received 0 messages received with too few bytes 0 messages received with wrong TTL 0 messages received with bad checksum 0 V1/V2 membership queries received 0 V3 membership queries received 0 membership queries received with invalid field(s) 0 general queries received 0 group queries received 0 group-source queries received 0 group-source queries dropped 0 membership reports received 0 membership reports received with invalid field(s) 0 membership reports received for groups to which we belong 0 V3 reports received without Router Alert 0 membership reports sent ip6: 0 total packets received 0 with size smaller than minimum 0 with data size < data length 0 with bad options 0 with incorrect version number 0 fragments received 0 fragments dropped (dup or out of space) 0 fragments dropped after timeout 0 fragments that exceeded limit 0 packets reassembled ok 0 packets for this host 0 packets forwarded 0 packets not forwardable 0 redirects sent 0 packets sent from this host 0 packets sent with fabricated ip header 0 output packets dropped due to no bufs, etc. 
0 output packets discarded due to no route 0 output datagrams fragmented 0 fragments created 0 datagrams that can't be fragmented 0 packets that violated scope rules 0 multicast packets which we don't join Mbuf statistics: 0 one mbuf 0 one ext mbuf 0 two or more ext mbuf 0 packets whose headers are not continuous 0 tunneling packets that can't find gif 0 packets discarded because of too many headers 0 failures of source address selection Source addresses selection rule applied: 1 first candidate 1 same address icmp6: 0 calls to icmp6_error 0 errors not generated in response to an icmp6 message 0 errors not generated because of rate limitation 0 messages with bad code fields 0 messages < minimum length 0 bad checksums 0 messages with bad length Histogram of error messages to be generated: 0 no route 0 administratively prohibited 0 beyond scope 0 address unreachable 0 port unreachable 0 packet too big 0 time exceed transit 0 time exceed reassembly 0 erroneous header field 0 unrecognized next header 0 unrecognized option 0 redirect 0 unknown 0 message responses generated 0 messages with too many ND options 0 messages with bad ND options 0 bad neighbor solicitation messages 0 bad neighbor advertisement messages 0 bad router solicitation messages 0 bad router advertisement messages 0 bad redirect messages 0 path MTU changes rip6: 0 messages received 0 checksum calculations on inbound 0 messages with bad checksum 0 messages dropped due to no socket 0 multicast messages dropped due to no socket 0 messages dropped due to full socket buffers 0 delivered 0 datagrams output ------------------------------------------------------------------------ netstat -m 266/272/538 mbufs in use (current/cache/total) 247/143/390/25600 mbuf clusters in use (current/cache/total/max) 260/137 mbuf+clusters out of packet secondary zone in use (current/cache) 0/0/0/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/19200 9k jumbo clusters in use (current/cache/total/max) 
0/0/0/12800 16k jumbo clusters in use (current/cache/total/max) 560K/354K/914K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines ------------------------------------------------------------------------ netstat -id Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll Drop bge0 1500 00:0b:5d:4c:4e:3f 48 0 73 0 0 0 bge0 1500 192.168.2.0 localhost 47 - 71 - - - fwe0* 1500 02:00:0e:90:a7:1d 0 0 0 0 0 0 fwip0 1500 00:00:0e:10:02:90:a7:1d:0a:02:ff:fe:00:00:00:00 0 0 0 0 0 0 plip0 1500 0 0 0 0 0 0 lo0 16384 0 0 0 0 0 0 lo0 16384 fe80:5::1 fe80:5::1 0 - 0 - - - lo0 16384 localhost ::1 0 - 0 - - - lo0 16384 your-net localhost 0 - 0 - - - ------------------------------------------------------------------------ netstat -anr Routing tables Internet: Destination Gateway Flags Refs Use Netif Expire default 192.168.2.2 UGS 23 71 bge0 127.0.0.1 link#5 UH 0 0 lo0 192.168.2.0/24 link#1 U 0 0 bge0 192.168.2.5 link#5 UHS 0 0 lo0 Internet6: Destination Gateway Flags Netif Expire ::1 ::1 UH lo0 fe80::%lo0/64 link#5 U lo0 fe80::1%lo0 link#5 UHS lo0 ff01:5::/32 fe80::1%lo0 U lo0 ff02::%lo0/32 fe80::1%lo0 U lo0 ------------------------------------------------------------------------ netstat -anA Active Internet connections (including servers) Tcpcb Proto Recv-Q Send-Q Local Address Foreign Address (state) c4a63278 tcp4 0 0 192.168.2.5.31858 95.220.50.57.62773 SYN_SENT c4a634f0 tcp4 0 0 192.168.2.5.64367 89.185.88.245.5084 SYN_SENT c4a63768 tcp4 0 0 192.168.2.5.45874 193.53.83.21.44156 SYN_SENT c4a5d768 tcp4 0 0 192.168.2.5.34819 91.77.41.60.38168 SYN_SENT c4a5d9e0 tcp4 0 5 192.168.2.5.21376 85.202.230.199.615 ESTABLISHED c4a5dc58 tcp4 0 0 192.168.2.5.58411 95.54.221.192.4924 SYN_SENT c4a5f000 tcp4 0 0 192.168.2.5.59844 82.193.115.223.272 
ESTABLISHED c4a5f278 tcp4 0 5 192.168.2.5.31068 78.60.226.227.3494 ESTABLISHED c4a5f4f0 tcp4 0 0 192.168.2.5.18169 77.41.19.177.45022 ESTABLISHED c4a5f768 tcp4 0 68 192.168.2.5.14422 92.124.160.82.2752 ESTABLISHED c4a5f9e0 tcp4 0 0 192.168.2.5.14207 94.19.183.244.1592 SYN_SENT c4a5fc58 tcp4 0 0 192.168.2.5.63918 80.92.96.70.44250 SYN_SENT c4894278 tcp4 0 0 192.168.2.5.58835 83.167.115.93.3549 SYN_SENT c48944f0 tcp4 0 0 192.168.2.5.55091 89.163.99.97.24259 SYN_SENT c4894768 tcp4 0 0 192.168.2.5.13778 79.165.161.236.514 ESTABLISHED c48949e0 tcp4 0 5 192.168.2.5.63650 78.37.127.202.3866 ESTABLISHED c4a5d000 tcp4 0 34 192.168.2.5.47597 77.220.58.88.24238 ESTABLISHED c4a5d278 tcp4 0 0 192.168.2.5.46432 82.193.97.92.35975 SYN_SENT c4a5d4f0 tcp4 5328 0 192.168.2.5.59401 92.242.90.182.2010 ESTABLISHED c4a67000 tcp4 0 0 192.168.2.5.49951 195.82.146.121.80 TIME_WAIT c4893278 tcp4 0 0 *.6919 *.* LISTEN c48934f0 tcp4 0 0 *.21 *.* LISTEN c4893768 tcp6 0 0 *.21 *.* LISTEN c4893c58 tcp4 0 0 *.22 *.* LISTEN c4894000 tcp6 0 0 *.22 *.* LISTEN c46db6e0 udp4 0 0 *.514 *.* c46db7bc udp6 0 0 *.514 *.* Active UNIX domain sockets Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr c46deec8 stream 0 0 c487b430 0 0 0 /var/run/devd.pipe c46de4b4 dgram 0 0 0 c46de8bc 0 c46de560 c46de560 dgram 0 0 0 c46de8bc 0 c46de60c c46de60c dgram 0 0 0 c46de8bc 0 0 c46de8bc dgram 0 0 c4882b84 0 c46de4b4 0 /var/run/logpriv c46de968 dgram 0 0 c4882c90 0 0 0 /var/run/log ------------------------------------------------------------------------ netstat -aL Current listen queue sizes (qlen/incqlen/maxqlen) Proto Listen Local Address tcp4 0/0/50 *.6919 tcp4 0/0/32 *.ftp tcp6 0/0/32 *.ftp tcp4 0/0/128 *.ssh tcp6 0/0/128 *.ssh unix 0/0/4 /var/run/devd.pipe ------------------------------------------------------------------------ fstat Segmentation fault ------------------------------------------------------------------------ dmesg Copyright (c) 1992-2009 The FreeBSD Project. 
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-RC1 #0: Thu Sep 17 20:45:19 UTC 2009 root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Pentium(R) M processor 1.60GHz (1600.06-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x6d6 Stepping = 6 Features=0xafe9f9bf Features2=0x180 real memory = 1073741824 (1024 MB) avail memory = 1027059712 (979 MB) kbd1 at kbdmux0 acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0xfc08-0xfc0b on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 0.1 (no driver attached) pci0: at device 0.3 (no driver attached) vgapci0: port 0x2450-0x2457 mem 0xd8000000-0xdfffffff,0xd0000000-0xd007ffff irq 11 at device 2.0 on pci0 agp0: on vgapci0 agp0: detected 8060k stolen memory agp0: aperture size is 128M vgapci1: mem 0xe0000000-0xe7ffffff,0xd0080000-0xd00fffff at device 2.1 on pci0 uhci0: port 0x20c0-0x20df irq 11 at device 29.0 on pci0 uhci0: [ITHREAD] uhci0: LegSup = 0x2f00 usbus0: on uhci0 uhci1: port 0x20e0-0x20ff irq 11 at device 29.1 on pci0 uhci1: [ITHREAD] uhci1: LegSup = 0x2f00 usbus1: on uhci1 uhci2: port 0x2400-0x241f irq 11 at device 29.2 on pci0 uhci2: [ITHREAD] uhci2: LegSup = 0x2f00 usbus2: on uhci2 ehci0: mem 0xd0100000-0xd01003ff irq 11 at device 29.7 on pci0 ehci0: [ITHREAD] usbus3: EHCI version 1.0 usbus3: on ehci0 pcib1: at device 30.0 on pci0 pci1: on pcib1 cbb0: irq 11 at device 10.0 on pci1 cardbus0: on cbb0 pccard0: <16-bit PCCard bus> on cbb0 cbb0: [FILTER] cbb1: irq 11 at device 10.1 on pci1 cardbus1: on cbb1 pccard1: <16-bit PCCard bus> on cbb1 cbb1: [FILTER] pci1: at device 10.2 (no driver attached) cbb2: at device 10.3 on pci1 cardbus2: on cbb2 
pccard2: <16-bit PCCard bus> on cbb2 cbb2: [FILTER] bge0: mem 0xd0200000-0xd020ffff irq 11 at device 12.0 on pci1 miibus0: on bge0 brgphy0: PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bge0: Ethernet address: 00:0b:5d:4c:4e:3f bge0: [ITHREAD] pci1: at device 13.0 (no driver attached) fwohci0: mem 0xd0216000-0xd02167ff,0xd0210000-0xd0213fff irq 11 at device 14.0 on pci1 fwohci0: [ITHREAD] fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:00:0e:10:02:90:a7:1d fwohci0: Phy 1394a available S400, 1 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: on fwohci0 dcons_crom0: on firewire0 dcons_crom0: bus_addr 0x3e4a4000 fwe0: on firewire0 if_fwe0: Fake Ethernet address: 02:00:0e:90:a7:1d fwe0: Ethernet address: 02:00:0e:90:a7:1d fwip0: on firewire0 fwip0: Firewire address: 00:00:0e:10:02:90:a7:1d @ 0xfffe00000000, S400, maxrec 2048 sbp0: on firewire0 fwohci0: Initiate bus reset fwohci0: fwohci_intr_core: BUS reset fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2440-0x244f at device 31.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 31.3 (no driver attached) pci0: at device 31.5 (no driver attached) pci0: at device 31.6 (no driver attached) acpi_button0: on acpi0 acpi_lid0: on acpi0 acpi_acad0: on acpi0 battery0: on acpi0 battery1: on acpi0 atrtc0: port 0x70-0x71 irq 8 on acpi0 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model Generic PS/2 mouse, device ID 0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] ppc0: port 0x378-0x37f,0x778-0x77b irq 7 on acpi0 ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode ppc0: 
[ITHREAD] ppbus0: on ppc0 plip0: on ppbus0 plip0: [ITHREAD] lpt0: on ppbus0 lpt0: [ITHREAD] lpt0: Interrupt-driven port ppi0: on ppbus0 cpu0: on acpi0 est0: on cpu0 p4tcc0: on cpu0 pmtimer0 on isa0 orm0: at iomem 0xdc000-0xdffff pnpid ORM0000 on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Timecounter "TSC" frequency 1600061907 Hz quality 800 Timecounters tick every 1.000 msec firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me) firewire0: bus manager 0 usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 480Mbps High Speed USB v2.0 ad0: 76319MB at ata0-master UDMA100 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 acd0: DVDR at ata1-master UDMA33 GEOM: ad0s2: geometry does not match label (255h,63s != 16h,63s). uhub0: 2 ports with 2 removable, self powered uhub1: 2 ports with 2 removable, self powered uhub2: 2 ports with 2 removable, self powered uhub3: 6 ports with 6 removable, self powered Trying to mount root from ufs:/dev/ad0s2a Entropy harvesting: interrupts ethernet point_to_point kickstart . /dev/ad0s2a: FILE SYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s2a: clean, 145310 free (3262 frags, 17756 blocks, 1.4% fragmentation) /dev/ad0s2e: FILE SYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s2e: clean, 215633 free (33 frags, 26950 blocks, 0.0% fragmentation) /dev/ad0s2f: FILE SYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s2f: clean, 2754824 free (28080 frags, 340843 blocks, 0.8% fragmentation) /dev/ad0s2d: FILE SYSTEM CLEAN; SKIPPING CHECKS /dev/ad0s2d: clean, 374758 free (62 frags, 46837 blocks, 0.0% fragmentation) Starting fusefs. fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8 Starting Network: lo0 bge0. add net default: gateway 192.168.2.2 Configuring syscons: blanktime . 
bge0: link state changed to UP Oct 8 19:27:29 localhost sm-mta[1038]: My unqualified host name (localhost) unknown; sleeping for retry Script /etc/rc.d/sendmail interrupted Thu Oct 8 19:27:37 UTC 2009 Oct 8 19:27:42 localhost login: ROOT LOGIN (root) ON ttyv0 Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x0 fault code = supervisor read, page not present instruction pointer = 0x20:0x0 stack pointer = 0x28:0xe672bc44 frame pointer = 0x28:0xe672bc68 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1128 (rtorrent) trap number = 12 panic: page fault cpuid = 0 Uptime: 1m2s Physical memory: 1002 MB Dumping 56 MB: 41 25 9 ------------------------------------------------------------------------ kernel config config: File /boot/kernel/kernel doesn't contain configuration file. Either unsupported, or not compiled with INCLUDE_CONFIG_FILE From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 17:06:28 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97C7D1065697; Thu, 8 Oct 2009 17:06:28 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yx0-f191.google.com (mail-yx0-f191.google.com [209.85.210.191]) by mx1.freebsd.org (Postfix) with ESMTP id 3C4F08FC15; Thu, 8 Oct 2009 17:06:27 +0000 (UTC) Received: by yxe29 with SMTP id 29so55139yxe.14 for ; Thu, 08 Oct 2009 10:06:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=qWW3Vql9oNaAu6ORq7vYIk2Vr/V50MEMVkI3WEjwGfM=; b=l+9zNecT9p7xNNfSn3mwaXQh8zEx7sWsfMZHQqC+0n58ECrPZz7vPxfPHFyxoKKMMv 
aWzV+kye2HYam6YK8LZYqE+a/pHoA9a8mC00tB5BuqM2S5LeuYx0ijsOHuuNInsLf6xe sZdfvpRa+ZoTMlg+Mib0LnV/X07c6BIICXmKo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=LDsHMQb2yhCkZ3YpvsHAaSCrHAkaC9DrX7e89PE8VyfIMhmgg+27ief5LXaoHmMwWk xmUpiyb2jbM4yZdTIuJTHX7eTN1oXxffCpDkhvKACGG74bbmdVNcHFXLmdAh1EjqPvcD quBZAyoWnUgkMFLCVbqgcs78c6b6tfMpCZvAI= MIME-Version: 1.0 Sender: artemb@gmail.com Received: by 10.91.97.9 with SMTP id z9mr844728agl.46.1255021587252; Thu, 08 Oct 2009 10:06:27 -0700 (PDT) In-Reply-To: <20091008160718.GB2134@garage.freebsd.pl> References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <4ACDA5EA.2010600@fsn.hu> <4ACDDED0.2070707@fsn.hu> <20091008160718.GB2134@garage.freebsd.pl> Date: Thu, 8 Oct 2009 10:06:27 -0700 X-Google-Sender-Auth: 318ccc6cccd49740 Message-ID: From: Artem Belevich To: Pawel Jakub Dawidek Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 17:06:28 -0000 I've tested with Kip's patch -- no lockups so far. 
--Artem

On Thu, Oct 8, 2009 at 9:07 AM, Pawel Jakub Dawidek wrote:
> On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:
>> Attila Nagy wrote:
>> >Hello,
>> >
>> >Pawel Jakub Dawidek wrote:
>> >>On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>> >>
>> >>>Backing out this change from the 8-STABLE kernel:
>> >>>http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>> >>>
>> >>>
>> >>>makes it survive about half and hour of IMAP searching. Of course
>> >>>only time will tell whether this helps in the long run, but so far
>> >>>10/10 tries succeeded to kill the machine with this method...
>> >>>
>> >>
>> >>Could you try this patch:
>> >>
>> >>    http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>> >>
>> >It seems (after running for two days) that this fixes my problem. And
>> >I see that Kip has came out with a similar version (which I couldn't
>> >yet test, but hope that will also do).
>> It seems that I was a little bit quick regarding this.
>> The machine just stopped with this:
>> last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56  14:36:25
>> 114 processes: 1 running, 112 sleeping, 1 zombie
>> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
>> Swap: 4096M Total, 15M Used, 4081M Free
>>
>>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>> 24025 root        1  44    0  3932K   992K vmwait  0   6:06  0.00% zpool
>> 84190 root        1  44    0  4700K  1592K CPU1    1   4:17  0.00% top
>> 99029 root        1  44    0  4132K  1212K nanslp  1   3:53  0.00% gstat
>> 26317 root        1  44    0  1528K   352K piperd  1   3:38  0.00% readproctitl
>> 49143      125    4  45    0 12248K  3788K sigwai  0   2:50  0.00% milter-greyl
>> 39969 root        1  44    0  1536K   516K vmwait  0   2:50  0.00% supervise
>> 40241 root        1  44    0  1536K   516K vmwait  0   2:47  0.00% supervise
>> 44633 root        1  44    0  1536K   512K vmwait  0   2:43  0.00% supervise
>> 43434 root        1  44    0  1536K   516K vmwait  0   2:43  0.00% supervise
>> 50575 root        1  44    0  1536K   516K vmwait  0   2:42  0.00% supervise
>> 45510 root        1  44    0  1536K   512K vmwait  0   2:42  0.00% supervise
>> 58146       60    1  44    0   264M  8828K pfault  0   2:32  0.00% imapd
>> 47526      389    6  44    0 92688K  2296K ucond   1   1:29  0.00% slapd
>>  5417 root        1  44    0  9396K  1680K pfault  1   1:26  0.00% sshd
>> 13147 root        1  44    0  3340K   860K vmwait  1   0:45  0.00% syslogd
>> 92597 root        1  44    0  9396K  1676K pfault  1   0:39  0.00% sshd
>> 26437      125    1  44    0  6924K  1700K vmwait  0   0:33  0.00% qmgr
>>
>> The above top was refreshing, but every other stuff on different ssh
>> consoles (like a running zpool iostat and gstat) was frozen.
>> Even top stopped when I have resized the window.
>
> Please try Kip's patch that was committed, it changes priorities a bit,
> which should help.
>
> --
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@FreeBSD.org                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil? Yes, I Am!
>

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 8 20:14:02 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D686A1065670; Thu, 8 Oct 2009 20:14:02 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id AE0248FC18; Thu, 8 Oct 2009 20:14:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n98KE2ii010672; Thu, 8 Oct 2009 20:14:02 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n98KE2C9010668; Thu, 8 Oct 2009 20:14:02 GMT (envelope-from linimon) Date: Thu, 8 Oct 2009 20:14:02 GMT Message-Id: <200910082014.n98KE2C9010668@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/139440: [ntfs] [panic] 8.0 RC1 panics on writing large files to ntfs-3g mounted partitions X-BeenThere:
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 20:14:02 -0000 Old Synopsis: 8.0 RC1 panics on writing large files to ntfs-3g mounted partitions New Synopsis: [ntfs] [panic] 8.0 RC1 panics on writing large files to ntfs-3g mounted partitions Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu Oct 8 20:13:50 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=139440 From owner-freebsd-fs@FreeBSD.ORG Sat Oct 10 08:54:27 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64DA8106568F; Sat, 10 Oct 2009 08:54:27 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 3AED88FC0C; Sat, 10 Oct 2009 08:54:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n9A8sQgJ043642; Sat, 10 Oct 2009 08:54:26 GMT (envelope-from gavin@freefall.freebsd.org) Received: (from gavin@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n9A8sQ4T043638; Sat, 10 Oct 2009 08:54:26 GMT (envelope-from gavin) Date: Sat, 10 Oct 2009 08:54:26 GMT Message-Id: <200910100854.n9A8sQ4T043638@freefall.freebsd.org> To: gavin@FreeBSD.org, freebsd-fs@FreeBSD.org, freebsd-ports-bugs@FreeBSD.org From: gavin@FreeBSD.org Cc: Subject: Re: ports/139440: [panic] 8.0 RC1 panics on writing large files to sysutils/fusefs-ntfs mounted partitions X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 
10 Oct 2009 08:54:27 -0000 Old Synopsis: [ntfs] [panic] 8.0 RC1 panics on writing large files to ntfs-3g mounted partitions New Synopsis: [panic] 8.0 RC1 panics on writing large files to sysutils/fusefs-ntfs mounted partitions Responsible-Changed-From-To: freebsd-fs->freebsd-ports-bugs Responsible-Changed-By: gavin Responsible-Changed-When: Sat Oct 10 08:53:18 UTC 2009 Responsible-Changed-Why: This appears to be a bug with the ntfs-3g fuse port (or fuse itself) http://www.freebsd.org/cgi/query-pr.cgi?pr=139440 From owner-freebsd-fs@FreeBSD.ORG Sat Oct 10 13:23:31 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A66DE106568D for ; Sat, 10 Oct 2009 13:23:31 +0000 (UTC) (envelope-from pepelac@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.26]) by mx1.freebsd.org (Postfix) with ESMTP id 3CC948FC16 for ; Sat, 10 Oct 2009 13:23:31 +0000 (UTC) Received: by ey-out-2122.google.com with SMTP id 4so1780218eyf.9 for ; Sat, 10 Oct 2009 06:23:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=JyFauzeV6cO6k/xa5RGOd/RsH9bHjJC0WVrVrYMHWgA=; b=q41C5f/BxohlPxJWE3lscfpX6gDNZIu+KE/A3gDCXgD9ZLO6I0bI1S+EDfOce3e6ZQ Wq9ATxHt90v+jrIHdk6ieT3XQlhg6AyHNgIgYfGkqGJlFjA24//UqYybwMALCt8SgyzQ qhOXIh3N7q5Q6zhDvGJIAoWsZ9i+c08+7f2eA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=mepMBOW39fvVtPu3zOaxZJUohUgvci1F2HYLPqbpVQ2mAGp7h744uiilrOWDqy4SNK yHnaur2+JLN942xxxitWafOD/zkgSwQt1y1V2kdKkgzAhLMTSl7dsbm3KvhUNNg78PFX KqqRWTJCvSIna1BQ7gytm/10XXyQ/E0Psz8zE= MIME-Version: 1.0 Received: by 10.211.132.3 with SMTP id j3mr4538261ebn.81.1255181010286; Sat, 10 Oct 2009 06:23:30 -0700 (PDT) Date: Sat, 10 Oct 2009 17:23:30 +0400 Message-ID: 
<8c9ae7950910100623if9c5149id9af6a5cfdbb3697@mail.gmail.com> From: Alexander Shevchenko To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: missing kstat.arcstat.l2_read_bytes X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Oct 2009 13:23:31 -0000

Good time of day!

Comparing the kstat output of Solaris 10u8 (ZFS pool version 15) and FreeBSD 8RC1 (ZFS pool version 13), I've found that FreeBSD is missing the kernel variable kstat.zfs.misc.arcstats.l2_read_bytes. It exists in the OpenSolaris source http://fxr.watson.org/fxr/source//common/fs/zfs/arc.c?v=OPENSOLARIS but is missing from the FreeBSD HEAD source http://fxr.watson.org/fxr/source//cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c

Is it a bug or a feature?

WBR, Alexander Shevchenko

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 10 15:58:57 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DD081065679 for ; Sat, 10 Oct 2009 15:58:57 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206049004.chello.pl [87.206.49.4]) by mx1.freebsd.org (Postfix) with ESMTP id 3A9D18FC12 for ; Sat, 10 Oct 2009 15:58:56 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id C3E5945CA6; Sat, 10 Oct 2009 17:58:53 +0200 (CEST) Received: from localhost (abhy128.neoplus.adsl.tpnet.pl [83.7.114.128]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id A2E3245685; Sat, 10 Oct 2009 17:58:47 +0200 (CEST) Date: Sat, 10 Oct 2009 17:58:46 +0200 From: Pawel Jakub Dawidek To: Alexander Shevchenko Message-ID: <20091010155846.GA1756@garage.freebsd.pl> References:
<8c9ae7950910100623if9c5149id9af6a5cfdbb3697@mail.gmail.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="u3/rZRmxL6MmkK24" Content-Disposition: inline In-Reply-To: <8c9ae7950910100623if9c5149id9af6a5cfdbb3697@mail.gmail.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: missing kstat.arcstat.l2_read_bytes X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Oct 2009 15:58:57 -0000 --u3/rZRmxL6MmkK24 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Sat, Oct 10, 2009 at 05:23:30PM +0400, Alexander Shevchenko wrote:
> Good time of day!
>
> Comparing the kstat output of Solaris 10u8 (ZFS pool version 15) and
> FreeBSD 8RC1 (ZFS pool version 13), I've found that FreeBSD is missing
> the kernel variable kstat.zfs.misc.arcstats.l2_read_bytes. It exists in
> the OpenSolaris source
> http://fxr.watson.org/fxr/source//common/fs/zfs/arc.c?v=OPENSOLARIS
> but is missing from the FreeBSD HEAD source
> http://fxr.watson.org/fxr/source//cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>
> Is it a bug or a feature?

Neither, you are simply comparing two different versions.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
--u3/rZRmxL6MmkK24 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFK0K82ForvXbEpPzQRAkrSAJ0VAAayWSru3JATU8yIr1Lb4+f7ewCgnC+l WJoj0VU0sPDX4C3IDGZ95sk= =z48K -----END PGP SIGNATURE----- --u3/rZRmxL6MmkK24--