Date:      Tue, 26 Jul 2011 16:25:38 +0200
From:      Jerome Herman <jherman@dichotomia.fr>
To:        freebsd-stable@freebsd.org
Subject:   Re: Making world but no kernel
Message-ID:  <4E2ECE62.4050605@dichotomia.fr>
In-Reply-To: <20110726131655.GA88280@icarus.home.lan>
References:  <4E2E9F24.1040108@dichotomia.fr> <20110726114438.GA86683@icarus.home.lan> <4E2EB814.9040704@dichotomia.fr> <20110726131655.GA88280@icarus.home.lan>

On 26/07/2011 15:16, Jeremy Chadwick wrote:
> On Tue, Jul 26, 2011 at 02:50:28PM +0200, Jerome Herman wrote:
>> [very large snip]
>>
>> So here I am starting to think that my disklabel and fsck are not in
>> sync with my kernel.
> I've never heard of either of these utilities (bsdlabel/disklabel, nor
> fsck) having to be "in sync with the kernel".  My opinion at this moment
> in time is that you're barking up the wrong tree.
Actually fsck and disklabel needed to be heavily modified in order to 
support gvinum fully, mainly because the underlying device is very 
different from a standard drive.

>
> As I'm not familiar with the vinum infrastructure, GEOM-based or
> without, others will have to assist with that.  However, I'm still not
> able to discern what your "type" of gvinum volume is -- is it a mirror,
> a stripe, or a raid5?
Actually it is RAID 10 of a sort: the first halves of the three disks 
form one striped plex, mirrored by a second plex built from the second 
halves of the same drives.
>
> Others who are more familiar with vinum are probably going to ask you to
> provide the full configuration details of your vinum setup, including
> all the commands you issued to create it.  "gvinum printconfig" would be
> a great start.
Here is gvinum printconfig

drive c device /dev/ad7
drive b device /dev/ad6
drive a device /dev/ad5
volume backup
plex name backup.p1 org striped 1024s vol backup
plex name backup.p0 org striped 1024s vol backup
sd name backup.p1.s2 drive b len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 2048s
sd name backup.p1.s1 drive a len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 1024s
sd name backup.p1.s0 drive c len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 0s
sd name backup.p0.s2 drive c len 1465137152s driveoffset 265s plex backup.p0 plexoffset 2048s
sd name backup.p0.s1 drive b len 1465137152s driveoffset 265s plex backup.p0 plexoffset 1024s
sd name backup.p0.s0 drive a len 1465137152s driveoffset 265s plex backup.p0 plexoffset 0s
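
As a sanity check on that layout, here is a minimal Python sketch (not part of gvinum; the helper names are made up for illustration) that parses the sd lines and verifies that the two subdisks on each drive do not overlap and fit within the drive. It assumes 512-byte sectors and the lba48 sector count that atacontrol cap reports for these drives.

```python
import re

# Subdisk lines copied from the printconfig output above, one "sd" per line.
PRINTCONFIG = """\
sd name backup.p1.s2 drive b len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 2048s
sd name backup.p1.s1 drive a len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 1024s
sd name backup.p1.s0 drive c len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 0s
sd name backup.p0.s2 drive c len 1465137152s driveoffset 265s plex backup.p0 plexoffset 2048s
sd name backup.p0.s1 drive b len 1465137152s driveoffset 265s plex backup.p0 plexoffset 1024s
sd name backup.p0.s0 drive a len 1465137152s driveoffset 265s plex backup.p0 plexoffset 0s
"""

SECTOR_BYTES = 512          # assumption: 512-byte sectors, as smartctl reports
DRIVE_SECTORS = 2930277168  # lba48 sector count from atacontrol cap

# \s+ between tokens so the pattern also tolerates mail-wrapped lines.
SD_RE = re.compile(
    r"sd\s+name\s+(?P<name>\S+)\s+drive\s+(?P<drive>\S+)\s+len\s+(?P<len>\d+)s\s+"
    r"driveoffset\s+(?P<off>\d+)s\s+plex\s+(?P<plex>\S+)\s+plexoffset\s+(?P<poff>\d+)s"
)


def parse_subdisks(text):
    """Return one dict per sd entry, with sector counts as integers."""
    return [
        {"name": m["name"], "drive": m["drive"], "plex": m["plex"],
         "len": int(m["len"]), "start": int(m["off"])}
        for m in SD_RE.finditer(text)
    ]


def extents_by_drive(subdisks):
    """Map drive name -> sorted list of (start, end, plex) sector extents."""
    by_drive = {}
    for sd in subdisks:
        by_drive.setdefault(sd["drive"], []).append(
            (sd["start"], sd["start"] + sd["len"], sd["plex"]))
    return {drive: sorted(ext) for drive, ext in by_drive.items()}


def has_overlap(extents):
    """True if any extent starts before the previous one ends."""
    return any(cur[0] < prev[1] for prev, cur in zip(extents, extents[1:]))


def plex_bytes(subdisks, plex):
    """Total size in bytes of all subdisks belonging to one plex."""
    return sum(sd["len"] for sd in subdisks if sd["plex"] == plex) * SECTOR_BYTES


if __name__ == "__main__":
    sds = parse_subdisks(PRINTCONFIG)
    for drive, ext in sorted(extents_by_drive(sds).items()):
        print(drive, ext,
              "overlap" if has_overlap(ext) else "no overlap",
              "fits" if ext[-1][1] <= DRIVE_SECTORS else "runs past end of drive")
    print("backup.p0 bytes:", plex_bytes(sds, "backup.p0"))
```

With these figures the second-half subdisks start exactly where the first-half ones end (265 + 1465137152 = 1465137417), and each plex adds up to 3 x 1465137152 sectors, roughly 2.25 TB.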

By the way, I did run make buildworld and make installworld.

Results:
a) the machine rebooted and started fine
b) it rebooted in 43 seconds (according to monitoring) instead of 
8+ minutes
c) fsck now works fine, in under 10 minutes

Boy, I love it when I do something completely stupid and it works. (This 
is a test machine, by the way; I would not do this in production.)

>
> Furthermore, could you please provide the data I asked for with regards
> to your storage devices?  In this case, /dev/ad5, /dev/ad6, and /dev/ad7
> (assuming those are all which are on the system)?  Let's try to rule out
> ANY underlying disk issues first, otherwise the rest of the above may
> be wasted effort.
>
I completely agree with "removing underlying issues first"; that is why, 
when I realized that my base install was borked, I went for the make 
installworld first.
The dmesg is very long (it holds about 12 reboots), but here is the rest:
*> /etc/fstab*
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/ad4s1a             /               ufs     rw              1       1
/dev/ad4s1b             none            swap    sw              0       0
/dev/ad4s1d             /var            ufs     rw              2       2
/dev/ad4s1e             /usr            ufs     rw              2       2
/dev/ad4s1f             /data           ufs     rw              2       2
/dev/gvinum/backup      /backup         ufs     rw              2       2
proc                    /proc           procfs  rw              0       0


*> sysctl kern.disks*
kern.disks: ad7 ad6 ad5 ad4

*> atacontrol list*
ATA channel 0:
     Master:      no device present
     Slave:       no device present
ATA channel 2:
     Master:  ad4 <ST31500341AS/CC1H> SATA revision 2.x
     Slave:   ad5 <ST31500341AS/CC1H> SATA revision 2.x
ATA channel 3:
     Master:  ad6 <ST31500341AS/CC1H> SATA revision 2.x
     Slave:   ad7 <ST31500341AS/CC1H> SATA revision 2.x


*> atacontrol cap ad5*
Protocol              SATA revision 2.x
device model          ST31500341AS
serial number         9VS4QNSC
firmware revision     CC1H
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       2930277168 sectors
dma supported
overlap not supported

Feature                      Support  Enable    Value           Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes       -      31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      yes     254/0xFE        254/0xFE

ad6 and ad7 are identical except for the serial number.
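
For what it is worth, the sector counts above can be cross-checked with a little arithmetic. The sketch below is only an illustration; it assumes 512-byte sectors and uses the "User Capacity" figure that smartctl reports for the same drive.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

LBA28_SECTORS = 268435455              # "lba supported" line from atacontrol cap
LBA48_SECTORS = 2930277168             # "lba48 supported" line from atacontrol cap
SMARTCTL_CAPACITY = 1_500_301_910_016  # "User Capacity" in bytes, from smartctl -a


def sectors_to_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to bytes."""
    return sectors * sector_bytes


if __name__ == "__main__":
    # The 28-bit figure is simply the largest sector count that mode can express.
    print("lba28 is 2**28 - 1:", LBA28_SECTORS == 2**28 - 1)
    print("lba28 bytes:", sectors_to_bytes(LBA28_SECTORS))
    print("lba48 bytes:", sectors_to_bytes(LBA48_SECTORS))
    print("matches smartctl:", sectors_to_bytes(LBA48_SECTORS) == SMARTCTL_CAPACITY)
```

So the lba48 count is the real size of the disk (1.5 TB), while the lba value is just the 28-bit addressing ceiling of about 137 GB.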



*> smartctl -a /dev/ad5*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31500341AS
Serial Number:    9VS4QNSC
LU WWN Device Id: 5 000c50 02d019b97
Firmware Version: CC1H
User Capacity:    1,500,301,910,016 bytes [1.50 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Jul 26 14:21:07 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                         was completed without error.
                                         Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                         without error or no self-test has ever
                                         been run.
Total time to complete Offline
data collection:                (  617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                         SCT Error Recovery Control supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   099   006    Pre-fail  Always       -       87140948
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   074   060   030    Pre-fail  Always       -       28683116
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5418
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       24
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
190 Airflow_Temperature_Cel 0x0022   060   047   045    Old_age   Always       -       40 (Min/Max 38/53)
194 Temperature_Celsius     0x0022   040   053   000    Old_age   Always       -       40 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   039   023   000    Old_age   Always       -       87140948
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       261580688201002
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       3949374025
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       4071332224

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5417         -

SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


*> smartctl -a /dev/ad6*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31500341AS
Serial Number:    9VS4MXD5
LU WWN Device Id: 5 000c50 02cd319bf
Firmware Version: CC1H
User Capacity:    1,500,301,910,016 bytes [1.50 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Jul 26 14:22:34 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                         was completed without error.
                                         Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                         without error or no self-test has ever
                                         been run.
Total time to complete Offline
data collection:                (  609) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                         SCT Error Recovery Control supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       107861899
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       23
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   074   060   030    Pre-fail  Always       -       26454013
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5419
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       23
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   096   096   000    Old_age   Always       -       4
190 Airflow_Temperature_Cel 0x0022   056   044   045    Old_age   Always   In_the_past 44 (0 14 56 39)
194 Temperature_Celsius     0x0022   044   056   000    Old_age   Always       -       44 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   057   029   000    Old_age   Always       -       107861899
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       161821482816811
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2546745907
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3981257233

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5417         -

SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



*> smartctl -a /dev/ad7*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31500341AS
Serial Number:    9VS4FSDY
LU WWN Device Id: 5 000c50 0274ee0d7
Firmware Version: CC1H
User Capacity:    1,500,301,910,016 bytes [1.50 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Jul 26 14:23:08 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                         was completed without error.
                                         Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                         without error or no self-test has ever
                                         been run.
Total time to complete Offline
data collection:                (  617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                         SCT Error Recovery Control supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   110   099   006    Pre-fail  Always       -       26689916
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  7 Seek_Error_Rate         0x000f   073   060   030    Pre-fail  Always       -       22747051
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5401
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   037   020    Old_age   Always       -       24
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   058   045   045    Old_age   Always   In_the_past 42 (Min/Max 35/55)
194 Temperature_Celsius     0x0022   042   055   000    Old_age   Always       -       42 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   041   028   000    Old_age   Always       -       26689916
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       18356690227381
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       125910856
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1003871140

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5399         -

SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
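
To save anyone from reading all three attribute tables by eye, here is a rough Python sketch that picks out the rows worth a second look. It is not a smartmontools feature; the function names and the choice of which raw counters to watch are my own assumptions. It flags a row when WHEN_FAILED is not "-", when the worst normalized value has reached a non-zero threshold, or when one of a few sector counters has a non-zero raw value.

```python
from typing import NamedTuple, Optional

# A few rows copied from the smartctl -a output above (ad5, ad6, ad7).
SAMPLE = {
    "ad5": "  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0",
    "ad6": "190 Airflow_Temperature_Cel 0x0022   056   044   045    Old_age   Always   In_the_past 44 (0 14 56 39)",
    "ad7": "  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8",
}

# Assumption: raw counters that should normally stay at zero on a healthy disk.
SECTOR_COUNTERS = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
                   "Offline_Uncorrectable"}


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    when_failed: str
    raw: str


def parse_attr_line(line: str) -> Optional[SmartAttr]:
    """Parse one attribute row; return None for headers and other lines."""
    parts = line.split()
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
    if len(parts) < 10 or not parts[0].isdigit():
        return None
    return SmartAttr(int(parts[0]), parts[1], int(parts[3]), int(parts[4]),
                     int(parts[5]), parts[8], " ".join(parts[9:]))


def findings(attr: SmartAttr) -> list:
    """Return a list of human-readable notes; empty if nothing stands out."""
    notes = []
    if attr.when_failed != "-":
        notes.append(f"WHEN_FAILED={attr.when_failed}")
    if attr.thresh > 0 and attr.worst <= attr.thresh:
        notes.append(f"worst {attr.worst} <= threshold {attr.thresh}")
    if attr.name in SECTOR_COUNTERS and int(attr.raw.split()[0]) > 0:
        notes.append(f"raw count {attr.raw.split()[0]}")
    return notes


if __name__ == "__main__":
    for disk, line in SAMPLE.items():
        attr = parse_attr_line(line)
        print(disk, attr.name, findings(attr) or "nothing flagged")
```

On the tables above this would point at the temperature attribute on ad6 and ad7, which crossed its threshold at some point in the past, and at the 8 reallocated sectors on ad7.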



