Date: Fri, 7 Jun 2013 14:37:47 +0200 (CEST)
From: "Pascal Braun, Continum" <pascal.braun@continum.net>
To: Jeremy Chadwick <jdc@koitsu.org>
Cc: freebsd-stable@freebsd.org, Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject: Re: ZFS crashing while zfs recv in progress
Message-ID: <1290657146.201020.1370608667469.JavaMail.root@continum.net>
In-Reply-To: <20130604095300.GA79993@icarus.home.lan>
[-- Attachment #1 --]
First, I'd like to thank you for your time and effort.
> - Disk da3 has a different drive firmware (A580) than the A800
> drives.
Somehow I missed that. I can replace this disk with an A800 one, although I don't think this will change much.
> - I have not verified if any of these disks use 4KByte sectors (dmesg is
> not going to tell you the entire truth). I would appreciate seeing
> "smartctl -x" output from {da0,da1,da3} so I could get an idea. Your
> pools use gpt labelling so I am left with the hope that your labels
> refer to the partition with proper 4KB alignment regardless.
The 'tank' disks are true 512-byte-sector disks. The zpool currently in use has ashift=9. I also tried ashift=12 in the past, but it didn't help. You'll find the smartctl output in the attachments.
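For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so ashift=9 corresponds to 512-byte sectors and ashift=12 to 4096-byte sectors. A minimal sketch of that arithmetic follows; the helper names are hypothetical and are not part of any ZFS tool.

```python
def sector_size_from_ashift(ashift: int) -> int:
    """Return the sector size in bytes implied by an ashift value (2**ashift)."""
    if ashift < 0:
        raise ValueError("ashift must be non-negative")
    return 1 << ashift


def ashift_for_sector_size(sector_bytes: int) -> int:
    """Return the ashift matching a power-of-two sector size in bytes."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


if __name__ == "__main__":
    # The two values mentioned in the message above.
    for ashift in (9, 12):
        print(f"ashift={ashift} -> {sector_size_from_ashift(ashift)} bytes")
```

The smartctl attachments report "512 bytes logical/physical" for all three drives, which is the case ashift=9 describes.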
> Can you tell me what exact disk (e.g. daXX) in the above list you used
> for swap, and what kind of both system and disk load were going on at
> the time you saw the swap message?
>
> I'm looking for a capture of "gstat -I500ms" output (you will need a
> VERY long/big terminal window to capture this given how many disks you
> have) while I/O is happening, as well as "top -s 1" in another window.
> I would also like to see "zpool iostat -v 1" output while things are
> going on, to help possibly narrow down if there is a single disk causing
> the entire I/O subsystem for that controller to choke.
The swap disk in use is da28.
The last output of top -s 1 that could be written to disk was:
---
last pid: 3653; load averages: 0.03, 0.19, 0.30 up 0+15:55:50 03:04:33
43 processes: 1 running, 41 sleeping, 1 zombie
CPU: 0.3% user, 0.0% nice, 0.6% system, 0.1% interrupt, 99.0% idle
Mem: 7456K Active, 27M Inact, 6767M Wired, 3404K Cache, 9053M Free
Swap: 256G Total, 5784K Used, 256G Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1917 root 1 22 0 33420K 2356K piperd 2 41:24 3.96% zfs
1913 root 1 21 0 71980K 5248K select 4 288:50 3.27% sshd
1853 root 1 20 0 29484K 2788K nanslp 0 3:13 0.00% gstat
1803 root 1 20 0 35476K 2128K nanslp 1 2:44 0.00% zpool
1798 root 1 20 0 16560K 2240K CPU0 7 1:07 0.00% top
1780 root 1 20 0 67884K 1792K select 2 0:23 0.00% sshd
1800 root 1 20 0 12052K 1484K select 6 0:17 0.00% script
1747 root 1 20 0 71980K 1868K select 1 0:13 0.00% sshd
3148 root 1 20 -20 21140K 8956K pause 7 0:11 0.00% atop
1850 root 1 20 0 12052K 1412K select 4 0:06 0.00% script
1784 root 1 20 0 67884K 1772K select 7 0:05 0.00% sshd
1652 nagios 1 20 0 12012K 1044K select 7 0:02 0.00% nrpe2
1795 root 1 20 0 12052K 1408K select 1 0:02 0.00% script
1538 root 1 20 0 11996K 960K nanslp 1 0:01 0.00% ipmon
1670 root 1 20 0 20272K 1876K select 1 0:01 0.00% sendmail
1677 root 1 20 0 14128K 1548K nanslp 2 0:00 0.00% cron
1547 root 1 20 0 12052K 1172K select 5 0:00 0.00% syslogd
---
The last output of zpool iostat -v 1:
capacity operations bandwidth
pool alloc free read write read write
-------------- ----- ----- ----- ----- ----- -----
tank 1.19T 63.8T 95 0 360K 0
raidz2 305G 16.0T 25 0 92.2K 0
gpt/disk3 - - 16 0 8.47K 0
gpt/disk9 - - 17 0 18.9K 0
gpt/disk15 - - 12 0 6.98K 0
gpt/disk19 - - 12 0 6.48K 0
gpt/disk23 - - 21 0 14.0K 0
gpt/disk27 - - 18 0 10.5K 0
gpt/disk31 - - 18 0 9.47K 0
gpt/disk36 - - 16 0 18.4K 0
gpt/disk33 - - 12 0 15.5K 0
raidz2 305G 16.0T 25 0 103K 0
gpt/disk1 - - 16 0 8.47K 0
gpt/disk4 - - 24 0 16.0K 0
gpt/disk7 - - 17 0 10.5K 0
gpt/disk10 - - 17 0 8.97K 0
gpt/disk13 - - 25 0 15.5K 0
gpt/disk16 - - 15 0 8.97K 0
gpt/disk24 - - 15 0 7.98K 0
gpt/disk32 - - 25 0 16.9K 0
gpt/disk37 - - 16 0 9.47K 0
raidz2 305G 16.0T 20 0 81.3K 0
gpt/disk2 - - 9 0 4.98K 0
gpt/disk5 - - 20 0 14.0K 0
gpt/disk8 - - 18 0 10.5K 0
gpt/disk11 - - 18 0 9.47K 0
gpt/disk17 - - 20 0 11.5K 0
gpt/disk21 - - 12 0 6.48K 0
gpt/disk25 - - 12 0 6.48K 0
gpt/disk29 - - 20 0 13.0K 0
gpt/disk38 - - 9 0 4.98K 0
raidz2 305G 16.0T 22 0 83.7K 0
gpt/disk12 - - 15 0 7.98K 0
gpt/disk14 - - 18 0 19.4K 0
gpt/disk18 - - 14 0 16.0K 0
gpt/disk22 - - 15 0 7.98K 0
gpt/disk26 - - 19 0 13.0K 0
gpt/disk30 - - 10 0 5.98K 0
gpt/disk34 - - 10 0 5.48K 0
gpt/disk35 - - 18 0 17.9K 0
gpt/disk39 - - 15 0 7.98K 0
-------------- ----- ----- ----- ----- ----- -----
zroot 2.67G 925G 0 0 0 0
mirror 2.67G 925G 0 0 0 0
gpt/disk0 - - 0 0 0 0
gpt/disk6 - - 0 0 0 0
-------------- ----- ----- ----- ----- ----- -----
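To compare the leaf devices in a capture like the one above, the per-disk rows can be reduced to plain numbers. The following is a rough sketch only. It assumes seven whitespace-separated columns per row, leaf names starting with "gpt/" as in this pool, and K/M/G/T suffixes meaning powers of 1024. The function names are hypothetical.

```python
# Assumption: zpool iostat prints sizes with power-of-1024 suffixes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_number(field: str) -> float:
    """Convert a zpool iostat field such as '8.47K', '16' or '-' to a float."""
    if field == "-":
        return 0.0
    if field[-1] in UNITS:
        return float(field[:-1]) * UNITS[field[-1]]
    return float(field)


def parse_leaf_devices(text: str) -> dict:
    """Map each gpt/ leaf device to (read ops/s, read bytes/s)."""
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: name, alloc, free, read ops, write ops, read bw, write bw
        if len(fields) != 7 or not fields[0].startswith("gpt/"):
            continue
        devices[fields[0]] = (to_number(fields[3]), to_number(fields[5]))
    return devices


# A few rows copied from the capture above.
SAMPLE = """\
raidz2 305G 16.0T 25 0 92.2K 0
gpt/disk3 - - 16 0 8.47K 0
gpt/disk9 - - 17 0 18.9K 0
gpt/disk15 - - 12 0 6.98K 0
"""

if __name__ == "__main__":
    for name, (ops, bw) in sorted(parse_leaf_devices(SAMPLE).items()):
        print(f"{name:12s} read ops/s={ops:5.0f} read bytes/s={bw:9.0f}")
```

A single one-second sample says little on its own; the same parsing would need to be applied across many samples before drawing any conclusion about an individual disk.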
The last output of gstat -I500ms:
[gstat file]
dT: 0.503s w: 0.500s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0| da0
0 10 10 5 4.6 0 0 0.0 4.6| da1
0 34 34 18 0.1 0 0 0.0 0.4| da2
0 8 8 6 0.2 0 0 0.0 0.1| da3
0 0 0 0 0.0 0 0 0.0 0.0| da0p1
0 0 0 0 0.0 0 0 0.0 0.0| da0p2
0 10 10 5 4.7 0 0 0.0 4.6| da1p1
0 34 34 18 0.2 0 0 0.0 0.5| da2p1
0 8 8 6 0.2 0 0 0.0 0.2| da3p1
0 18 18 12 0.2 0 0 0.0 0.2| da4
0 52 52 31 0.5 0 0 0.0 2.5| da5
0 0 0 0 0.0 0 0 0.0 0.0| da6
0 12 12 8 3.3 0 0 0.0 3.9| da7
0 32 32 17 0.1 0 0 0.0 0.4| da8
0 10 10 7 0.2 0 0 0.0 0.1| da9
0 12 12 7 0.1 0 0 0.0 0.2| da10
0 32 32 16 0.1 0 0 0.0 0.4| da11
0 42 42 22 0.1 0 0 0.0 0.5| da12
0 18 18 12 0.2 0 0 0.0 0.2| da13
0 62 62 38 0.1 0 0 0.0 0.8| da14
0 6 6 4 0.2 0 0 0.0 0.1| da15
0 14 14 8 0.2 0 0 0.0 0.2| da16
0 52 52 32 0.3 0 0 0.0 1.4| da17
0 40 40 21 0.1 0 0 0.0 0.5| da18
0 6 6 4 0.1 0 0 0.0 0.1| da19
0 0 0 0 0.0 0 0 0.0 0.0| da20
0 38 38 21 1.3 0 0 0.0 5.1| da21
0 40 40 20 0.1 0 0 0.0 0.5| da22
0 10 10 7 0.1 0 0 0.0 0.1| da23
0 14 14 8 3.4 0 0 0.0 4.7| da24
0 38 38 20 1.5 0 0 0.0 5.8| da25
0 62 62 39 0.1 0 0 0.0 0.8| da26
0 6 6 4 0.2 0 0 0.0 0.1| da27
0 0 0 0 0.0 0 0 0.0 0.0| da28
0 52 52 4 0.2 0 0 0.0 0.1| da29
0 70 70 36 0.1 0 0 0.0 0.9| da30
0 38 38 19 0.1 0 0 0.0 0.5| da31
0 0 0 0 0.0 0 0 0.0 0.0| da32
0 40 40 20 1.1 0 0 0.0 4.5| da33
0 70 70 35 0.1 0 0 0.0 0.9| da34
0 87 87 51 0.6 0 0 0.0 4.9| da35
0 54 54 32 0.1 0 0 0.0 0.7| da36
0 0 0 0 0.0 0 0 0.0 0.0| da37
0 8 8 4 18.8 0 0 0.0 3.8| da38
0 56 56 28 0.1 0 0 0.0 0.7| da39
[...]
---
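One way to look for a single slow disk in a gstat capture like the one above is to rank devices by read latency (the ms/r column). The sketch below is an illustration under assumptions: it expects the nine numeric columns shown in the header line followed by "|" and the device name, and the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class GstatRow(NamedTuple):
    name: str
    ops_per_s: float
    ms_per_read: float
    busy_pct: float


def parse_gstat(text: str) -> List[GstatRow]:
    """Parse gstat rows of the form '<9 numeric columns>| <device name>'."""
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips the 'dT:' line and the column header
        stats, _, name = line.partition("|")
        fields = stats.split()
        if len(fields) != 9:
            continue
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # not a data row
        # Columns: L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy
        rows.append(GstatRow(name.strip(), values[1], values[4], values[8]))
    return rows


def slowest_reads(rows: List[GstatRow], top: int = 3) -> List[GstatRow]:
    """Return the devices with the highest ms/r, slowest first."""
    return sorted(rows, key=lambda r: r.ms_per_read, reverse=True)[:top]


# A few rows copied from the capture above.
SAMPLE = """\
dT: 0.503s w: 0.500s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 10 10 5 4.6 0 0 0.0 4.6| da1
0 34 34 18 0.1 0 0 0.0 0.4| da2
0 87 87 51 0.6 0 0 0.0 4.9| da35
0 8 8 4 18.8 0 0 0.0 3.8| da38
"""

if __name__ == "__main__":
    for row in slowest_reads(parse_gstat(SAMPLE)):
        print(f"{row.name:6s} ms/r={row.ms_per_read:5.1f} busy={row.busy_pct:4.1f}%")
```

A high ms/r in one 500 ms sample is only a hint; a disk would have to stand out consistently across samples to be a credible suspect.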
> Next: are you using compression or dedup on any of your filesystems?
> If not, have you ever in the past?
No, this pool was built from scratch without any compression or dedup.
> Next: could we have your loader.conf and sysctl.conf please?
loader.conf
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot"
console=comconsole
sysctl.conf is empty
> If you could put a swap disk on a dedicated controller (and no other
> disks on it), that would be ideal. Please do not use USB for this task
> (the USB stack may introduce its own set of complexities pertaining to
> interrupt usage).
I can't easily do this in the current setup. I would have to recreate the primary pool differently.
Thanks again,
Pascal
[-- Attachment #2 --]
smartctl 6.1 2013-03-16 r3800 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K1000.C
Device Model: Hitachi HDS721010CLA330
Serial Number: JP2940N118VSMV
LU WWN Device Id: 5 000cca 39cd21eef
Firmware Version: JP4OA3MA
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Tue Jun 4 13:29:22 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Disabled
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, NOT FROZEN [SEC1]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 9812) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 164) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 095 095 016 - 65550
2 Throughput_Performance P-S--- 134 134 054 - 100
3 Spin_Up_Time POS--- 124 124 024 - 305 (Average 307)
4 Start_Stop_Count -O--C- 100 100 000 - 15
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 138 138 020 - 31
9 Power_On_Hours -O--C- 100 100 000 - 537
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 15
192 Power-Off_Retract_Count -O--CK 100 100 000 - 25
193 Load_Cycle_Count -O--C- 100 100 000 - 25
194 Temperature_Celsius -O---- 200 200 000 - 30 (Min/Max 24/36)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x03 GPL R/O 1 Ext. Comprehensive SMART error log
0x04 GPL R/O 7 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters
0x20 GPL R/O 1 Streaming performance log [OBS-8]
0x21 GPL R/O 1 Write stream error log
0x22 GPL R/O 1 Read stream error log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log Version: 0 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 256 (0x0100)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 29 Celsius
Power Cycle Min/Max Temperature: 29/31 Celsius
Lifetime Min/Max Temperature: 24/36 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -40/70 Celsius
Temperature History Size (Index): 128 (101)
Index Estimated Time Temperature Celsius
102 2013-06-04 11:22 29 **********
... ..( 22 skipped). .. **********
125 2013-06-04 11:45 29 **********
126 2013-06-04 11:46 30 ***********
127 2013-06-04 11:47 30 ***********
0 2013-06-04 11:48 30 ***********
1 2013-06-04 11:49 29 **********
2 2013-06-04 11:50 30 ***********
3 2013-06-04 11:51 29 **********
... ..( 6 skipped). .. **********
10 2013-06-04 11:58 29 **********
11 2013-06-04 11:59 30 ***********
12 2013-06-04 12:00 29 **********
13 2013-06-04 12:01 29 **********
14 2013-06-04 12:02 30 ***********
15 2013-06-04 12:03 29 **********
... ..( 2 skipped). .. **********
18 2013-06-04 12:06 29 **********
19 2013-06-04 12:07 30 ***********
20 2013-06-04 12:08 29 **********
21 2013-06-04 12:09 30 ***********
22 2013-06-04 12:10 29 **********
... ..( 2 skipped). .. **********
25 2013-06-04 12:13 29 **********
26 2013-06-04 12:14 30 ***********
27 2013-06-04 12:15 30 ***********
28 2013-06-04 12:16 29 **********
... ..( 34 skipped). .. **********
63 2013-06-04 12:51 29 **********
64 2013-06-04 12:52 30 ***********
65 2013-06-04 12:53 29 **********
... ..( 8 skipped). .. **********
74 2013-06-04 13:02 29 **********
75 2013-06-04 13:03 30 ***********
76 2013-06-04 13:04 29 **********
... ..( 7 skipped). .. **********
84 2013-06-04 13:12 29 **********
85 2013-06-04 13:13 30 ***********
86 2013-06-04 13:14 29 **********
... ..( 6 skipped). .. **********
93 2013-06-04 13:21 29 **********
94 2013-06-04 13:22 30 ***********
95 2013-06-04 13:23 30 ***********
96 2013-06-04 13:24 30 ***********
97 2013-06-04 13:25 29 **********
98 2013-06-04 13:26 29 **********
99 2013-06-04 13:27 29 **********
100 2013-06-04 13:28 30 ***********
101 2013-06-04 13:29 30 ***********
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
Device Statistics (GP Log 0x04)
Page Offset Size Value Description
1 ===== = = == General Statistics (rev 1) ==
1 0x008 4 15 Lifetime Power-On Resets
1 0x010 4 537 Power-on Hours
1 0x018 6 163693813 Logical Sectors Written
1 0x020 6 10142725 Number of Write Commands
1 0x028 6 20963385127 Logical Sectors Read
1 0x030 6 781982 Number of Read Commands
3 ===== = = == Rotating Media Statistics (rev 1) ==
3 0x008 4 537 Spindle Motor Power-on Hours
3 0x010 4 537 Head Flying Hours
3 0x018 4 25 Head Load Events
3 0x020 4 0 Number of Reallocated Logical Sectors
3 0x028 4 0 Read Recovery Attempts
3 0x030 4 4294967295 Number of Mechanical Start Failures
4 ===== = = == General Errors Statistics (rev 1) ==
4 0x008 4 0 Number of Reported Uncorrectable Errors
4 0x010 4 1 Resets Between Cmd Acceptance and Completion
5 ===== = = == Temperature Statistics (rev 1) ==
5 0x008 1 30 Current Temperature
5 0x010 1 29~ Average Short Term Temperature
5 0x018 1 -~ Average Long Term Temperature
5 0x020 1 36 Highest Temperature
5 0x028 1 24 Lowest Temperature
5 0x030 1 34~ Highest Average Short Term Temperature
5 0x038 1 0~ Lowest Average Short Term Temperature
5 0x040 1 -~ Highest Average Long Term Temperature
5 0x048 1 -~ Lowest Average Long Term Temperature
5 0x050 4 0 Time in Over-Temperature
5 0x058 1 60 Specified Maximum Operating Temperature
5 0x060 4 0 Time in Under-Temperature
5 0x068 1 0 Specified Minimum Operating Temperature
6 ===== = = == Transport Statistics (rev 1) ==
6 0x008 4 53 Number of Hardware Resets
6 0x010 4 53 Number of ASR Events
6 0x018 4 0 Number of Interface CRC Errors
|_ ~ normalized value
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0009 2 5 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 5 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
[-- Attachment #3 --]
smartctl 6.1 2013-03-16 r3800 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 5K3000
Device Model: Hitachi HDS5C3020ALA632
Serial Number: ML2220F3349STE
LU WWN Device Id: 5 000cca 369ec3ca9
Firmware Version: ML6OA800
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5940 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Tue Jun 4 13:28:24 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Disabled
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, NOT FROZEN [SEC1]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (22117) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 369) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 016 - 0
2 Throughput_Performance P-S--- 135 135 054 - 97
3 Spin_Up_Time POS--- 136 136 024 - 403 (Average 402)
4 Start_Stop_Count -O--C- 100 100 000 - 61
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 146 146 020 - 29
9 Power_On_Hours -O--C- 100 100 000 - 1686
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 61
192 Power-Off_Retract_Count -O--CK 100 100 000 - 109
193 Load_Cycle_Count -O--C- 100 100 000 - 109
194 Temperature_Celsius -O---- 193 193 000 - 31 (Min/Max 21/39)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x03 GPL R/O 1 Ext. Comprehensive SMART error log
0x04 GPL R/O 7 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x08 GPL R/O 2 Power Conditions log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters
0x20 GPL R/O 1 Streaming performance log [OBS-8]
0x21 GPL R/O 1 Write stream error log
0x22 GPL R/O 1 Read stream error log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 256 (0x0100)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 31 Celsius
Power Cycle Min/Max Temperature: 31/33 Celsius
Lifetime Min/Max Temperature: 21/39 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -40/70 Celsius
Temperature History Size (Index): 128 (93)
Index Estimated Time Temperature Celsius
94 2013-06-04 11:21 32 *************
... ..( 30 skipped). .. *************
125 2013-06-04 11:52 32 *************
126 2013-06-04 11:53 31 ************
... ..( 94 skipped). .. ************
93 2013-06-04 13:28 31 ************
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
Device Statistics (GP Log 0x04)
Page Offset Size Value Description
1 ===== = = == General Statistics (rev 1) ==
1 0x008 4 61 Lifetime Power-On Resets
1 0x010 4 1686 Power-on Hours
1 0x018 6 3079958370 Logical Sectors Written
1 0x020 6 45300814 Number of Write Commands
1 0x028 6 274975077 Logical Sectors Read
1 0x030 6 6599624 Number of Read Commands
3 ===== = = == Rotating Media Statistics (rev 1) ==
3 0x008 4 1684 Spindle Motor Power-on Hours
3 0x010 4 1684 Head Flying Hours
3 0x018 4 109 Head Load Events
3 0x020 4 0 Number of Reallocated Logical Sectors
3 0x028 4 15 Read Recovery Attempts
3 0x030 4 7 Number of Mechanical Start Failures
4 ===== = = == General Errors Statistics (rev 1) ==
4 0x008 4 0 Number of Reported Uncorrectable Errors
4 0x010 4 0 Resets Between Cmd Acceptance and Completion
5 ===== = = == Temperature Statistics (rev 1) ==
5 0x008 1 31 Current Temperature
5 0x010 1 31~ Average Short Term Temperature
5 0x018 1 30~ Average Long Term Temperature
5 0x020 1 39 Highest Temperature
5 0x028 1 21 Lowest Temperature
5 0x030 1 35~ Highest Average Short Term Temperature
5 0x038 1 25~ Lowest Average Short Term Temperature
5 0x040 1 33~ Highest Average Long Term Temperature
5 0x048 1 25~ Lowest Average Long Term Temperature
5 0x050 4 0 Time in Over-Temperature
5 0x058 1 60 Specified Maximum Operating Temperature
5 0x060 4 0 Time in Under-Temperature
5 0x068 1 0 Specified Minimum Operating Temperature
6 ===== = = == Transport Statistics (rev 1) ==
6 0x008 4 5127 Number of Hardware Resets
6 0x010 4 375 Number of ASR Events
6 0x018 4 0 Number of Interface CRC Errors
|_ ~ normalized value
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0009 2 11 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 11 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
[-- Attachment #4 --]
smartctl 6.1 2013-03-16 r3800 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 5K3000
Device Model: Hitachi HDS5C3020ALA632
Serial Number: ML4220F3187SMK
LU WWN Device Id: 5 000cca 369d1d79c
Firmware Version: ML6OA580
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5940 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Tue Jun 4 13:29:30 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Disabled
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, NOT FROZEN [SEC1]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (23985) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 400) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 016 - 0
2 Throughput_Performance P-S--- 131 131 054 - 112
3 Spin_Up_Time POS--- 135 135 024 - 407 (Average 407)
4 Start_Stop_Count -O--C- 100 100 000 - 84
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 148 148 020 - 28
9 Power_On_Hours -O--C- 100 100 000 - 2314
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 84
192 Power-Off_Retract_Count -O--CK 100 100 000 - 89
193 Load_Cycle_Count -O--C- 100 100 000 - 89
194 Temperature_Celsius -O---- 200 200 000 - 30 (Min/Max 17/36)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x03 GPL R/O 1 Ext. Comprehensive SMART error log
0x04 GPL R/O 7 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x08 GPL R/O 1 Power Conditions log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters
0x20 GPL R/O 1 Streaming performance log [OBS-8]
0x21 GPL R/O 1 Write stream error log
0x22 GPL R/O 1 Read stream error log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 256 (0x0100)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 30 Celsius
Power Cycle Min/Max Temperature: 29/31 Celsius
Lifetime Min/Max Temperature: 17/36 Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -40/70 Celsius
Temperature History Size (Index): 128 (120)
Index Estimated Time Temperature Celsius
121 2013-06-04 11:22 31 ************
... ..( 64 skipped). .. ************
58 2013-06-04 12:27 31 ************
59 2013-06-04 12:28 30 ***********
... ..( 60 skipped). .. ***********
120 2013-06-04 13:29 30 ***********
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
Device Statistics (GP Log 0x04)
Page Offset Size Value Description
1 ===== = = == General Statistics (rev 1) ==
1 0x008 4 84 Lifetime Power-On Resets
1 0x010 4 2314 Power-on Hours
1 0x018 6 4301742943 Logical Sectors Written
1 0x020 6 51429622 Number of Write Commands
1 0x028 6 249696692048 Logical Sectors Read
1 0x030 6 8337099 Number of Read Commands
3 ===== = = == Rotating Media Statistics (rev 1) ==
3 0x008 4 2311 Spindle Motor Power-on Hours
3 0x010 4 2311 Head Flying Hours
3 0x018 4 89 Head Load Events
3 0x020 4 0 Number of Reallocated Logical Sectors
3 0x028 4 11 Read Recovery Attempts
3 0x030 4 7 Number of Mechanical Start Failures
4 ===== = = == General Errors Statistics (rev 1) ==
4 0x008 4 0 Number of Reported Uncorrectable Errors
4 0x010 4 0 Resets Between Cmd Acceptance and Completion
5 ===== = = == Temperature Statistics (rev 1) ==
5 0x008 1 30 Current Temperature
5 0x010 1 30~ Average Short Term Temperature
5 0x018 1 29~ Average Long Term Temperature
5 0x020 1 36 Highest Temperature
5 0x028 1 17 Lowest Temperature
5 0x030 1 34~ Highest Average Short Term Temperature
5 0x038 1 20~ Lowest Average Short Term Temperature
5 0x040 1 31~ Highest Average Long Term Temperature
5 0x048 1 23~ Lowest Average Long Term Temperature
5 0x050 4 0 Time in Over-Temperature
5 0x058 1 60 Specified Maximum Operating Temperature
5 0x060 4 0 Time in Under-Temperature
5 0x068 1 0 Specified Minimum Operating Temperature
6 ===== = = == Transport Statistics (rev 1) ==
6 0x008 4 1354 Number of Hardware Resets
6 0x010 4 244 Number of ASR Events
6 0x018 4 0 Number of Interface CRC Errors
|_ ~ normalized value
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0009 2 5 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 5 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
