Date: Sun, 24 Nov 2013 11:29:06 -0500
From: Eitan Adler <lists@eitanadler.com>
To: Thomas Steen Rasmussen <thomas@gibfest.dk>
Cc: "freebsd-fs@freebsd.org" <fs@freebsd.org>
Subject: Re: ZFS (or something) is absurdly slow
Message-ID: <CAF6rxgnm1zUHeN8%2BjtVrqTfr7vhJ23=%2BFzGZsbsOHAXEw46hYA@mail.gmail.com>
In-Reply-To: <5291B2CC.2040907@gibfest.dk>
References: <CAF6rxgmepSN9pFPH%2BQiLaNqhzXxkXwu=59zvfD-6gGEMg9zh1g@mail.gmail.com> <5290E0CF.20704@gibfest.dk> <CAF6rxgkPVDnmS1RTugLVYUP3H6=Azjx%2BsgYv91_6d0yAg6Gthw@mail.gmail.com> <5291B2CC.2040907@gibfest.dk>
On Sun, Nov 24, 2013 at 3:03 AM, Thomas Steen Rasmussen
<thomas@gibfest.dk> wrote:
> On 24-11-2013 04:15, Eitan Adler wrote:
>>
>>
>>> vfsstat.d https://forums.freebsd.org/showpost.php?p=182070&postcount=6
>>
>> I can run this script, what output should I be looking for?
>
>
> Check the sample output on the page: It shows two lists, "Number
> of operations" and "Bytes read or write". The lists are ordered
> with the busiest at the bottom and separated by filesystem
> location. While things are slow, try running it to see if some
> location on the filesystem is being hammered.....

I will do so.

> Another thing you should probably do is run a SMART check on the
> disk to see if something is wrong with it.

See the complete output below. The only thing which stands out to me is:

200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 19275

> I had another case with
> a zfs mirror that performed appallingly, turned out it was because
> one of the disks was dodgy, not in a way that made zfs show
> checksum errors, but enough to make it really really slow. Since a
> ZFS vdev only performs as good as the slowest disk in a vdev,
> which in turn will slow the whole pool down, replacing the disk
> made everything much better.

===============
smartctl 6.2 2013-07-26 r3841 [FreeBSD 11.0-CURRENT amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus SpinPoint M8 (AF)
Device Model:     ST1000LM024 HN-M101MBB
Serial Number:    S2U5J9FCB79134
LU WWN Device Id: 5 0004cf 20904e7cf
Firmware Version: 2AR10001
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Nov 24 11:23:21 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (12780) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 213) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       25
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   089   089   025    Pre-fail  Always       -       3453
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       158
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       7285
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       280
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       180
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       155
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   055   047   000    Old_age   Always       -       45 (Min/Max 18/63)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       19275
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       280
225 Load_Cycle_Count        0x0032   091   091   000    Old_age   Always       -       95664

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
=====================

> What does diskinfo -ct /dev/whatever say about the seek times on the
> bad disk ?

[10005 root@gravity (100%) /home/eitan !2!]#diskinfo -ct /dev/ada1
/dev/ada1
        512             # sectorsize
        1000204886016   # mediasize in bytes (932G)
        1953525168      # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        1938021         # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        S2U5J9FCB79134  # Disk ident.

I/O command overhead:
        time to read 10MB block      0.099713 sec       =    0.005 msec/sector
        time to read 20480 sectors   1.447996 sec       =    0.071 msec/sector
        calculated command overhead                     =    0.066 msec/sector

Seek times:
        Full stroke:      250 iter in   8.036950 sec =   32.148 msec
        Half stroke:      250 iter in   5.463750 sec =   21.855 msec
        Quarter stroke:   500 iter in  10.542506 sec =   21.085 msec
        Short forward:    400 iter in   5.707363 sec =   14.268 msec
        Short backward:   400 iter in   4.645333 sec =   11.613 msec
        Seq outer:       2048 iter in   0.096977 sec =    0.047 msec
        Seq inner:       2048 iter in   1.853596 sec =    0.905 msec

Transfer rates:
        outside:       102400 kbytes in   0.949048 sec =   107898 kbytes/sec
        middle:        102400 kbytes in   1.659245 sec =    61715 kbytes/sec
        inside:        102400 kbytes in   2.020322 sec =    50685 kbytes/sec

> Are the results the same if you boot off of an usb stick and
> test the disk when it is completely idle and independent of the running OS ?

Good question. I can not check this at the moment.

-- 
Eitan Adler
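
For anyone reading the attribute table above and wondering whether a large
raw value such as the 19275 for Multi_Zone_Error_Rate counts as a failure:
smartctl marks an attribute as failed when its normalized VALUE is less than
or equal to THRESH, and raw values are vendor-specific. The sketch below
applies only that comparison to a few rows copied from the output in this
message. The function names (parse_attributes, at_or_below_threshold,
nonzero_raw) and the regular expression are illustrative assumptions, not
part of smartmontools, and a clean result here does not rule out a slow disk.

```python
import re

# A few attribute rows copied from the smartctl output in this message.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       25
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       19275
"""

# Hypothetical pattern for one attribute row:
# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return one dict per attribute row found in smartctl -A style text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "type": m.group(6),
                "raw": int(m.group(9)),  # leading integer of RAW_VALUE only
            })
    return rows


def at_or_below_threshold(rows):
    """Names of attributes whose normalized VALUE is <= THRESH."""
    return [r["name"] for r in rows if r["value"] <= r["thresh"]]


def nonzero_raw(rows):
    """Names of attributes with a non-zero raw counter, for manual review."""
    return [r["name"] for r in rows if r["raw"] != 0]


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    print("at or below threshold:", at_or_below_threshold(attrs))
    print("non-zero raw values:  ", nonzero_raw(attrs))
```

On the rows above, no attribute is at or below its threshold, while
Raw_Read_Error_Rate and Multi_Zone_Error_Rate show up in the non-zero list.
That matches the PASSED self-assessment and still leaves the raw counters as
something to keep an eye on.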