Date:      Tue, 10 Jan 2012 10:56:25 +0000
From:      Pete French <petefrench@ingresso.co.uk>
To:        freebsd-stable@freebsd.org
Subject:   Odd zpool problem - always one disc offline, maybe controller related ?
Message-ID:  <E1RkZNR-0002xZ-IN@dilbert.ingresso.co.uk>

I upgraded my system to -stable on January 6th, and since then I
have noticed a very odd problem. I have a zpool with 4 drives in it,
and one of them is always 'OFFLINE' - if I put it online and it
starts resilvering then another one immediately goes offline.

It's the same two drives alternating as well - very perplexing. I
have checked all the cabling (they are eSATA drives), and it is all
pushed home solid. It looks from dmesg like the drive is
disconnecting and reconnecting briefly, but that's triggering it being
dropped out of the zpool.

I must admit that though I noticed this on the 6th, I can't tell you
whether it was working on the version I was running previously, as
I don't check the zpool on that machine as often as I should. I am
recompiling an earlier version now though to see.

Details of what happens are below:

-pete.

------

[pete@skerry ~]$ zpool status
  pool: cube
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 6.41G in 2h27m with 0 errors on Mon Jan  9 23:23:27 2012
config:

        NAME                     STATE     READ WRITE CKSUM
        cube                     DEGRADED     0     0     0
          mirror-0               ONLINE       0     0     0
            ada2                 ONLINE       0     0     0
            ada3                 ONLINE       0     0     0
          mirror-1               DEGRADED     0     0     0
            ada1                 ONLINE       0     0     0
            8890308235385361660  REMOVED      0     0     0  was /dev/ada0

errors: No known data errors
[pete@skerry ~]$ su
Password:
skerry# zpool online ada0
missing device name
usage:
        online [-e] <pool> <device> ...
skerry# zpool online cube ada0
skerry# zpool status
  pool: cube
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jan 10 09:03:58 2012
        1.02G scanned out of 1.42T at 80.6M/s, 5h8m to go
        492M resilvered, 0.07% done
config:

        NAME                     STATE     READ WRITE CKSUM
        cube                     DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            ada2                 ONLINE       0     0     0
            6739201713000599902  REMOVED      0     0     0  was /dev/ada3
          mirror-1               ONLINE       0     0     0
            ada1                 ONLINE       0     0     0
            ada0                 ONLINE       0     0     0  (resilvering)

errors: No known data errors
skerry# 
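
Since the symptom is easy to miss without checking the pool regularly, a small
script could flag it. The following is a minimal sketch, not part of the
original report: it assumes the `zpool status` text has already been captured
into a string, and the function name `dropped_devices` and the set of states
treated as "dropped" are illustrative choices.

```python
# Sketch: list vdevs that zpool status reports as dropped out of the pool.
# SAMPLE is the config table from the first 'zpool status' shown above.
SAMPLE = """\
config:

        NAME                     STATE     READ WRITE CKSUM
        cube                     DEGRADED     0     0     0
          mirror-0               ONLINE       0     0     0
            ada2                 ONLINE       0     0     0
            ada3                 ONLINE       0     0     0
          mirror-1               DEGRADED     0     0     0
            ada1                 ONLINE       0     0     0
            8890308235385361660  REMOVED      0     0     0  was /dev/ada0

errors: No known data errors
"""

# Assumption: these are the states worth alerting on for a leaf device.
DROPPED_STATES = {"REMOVED", "OFFLINE", "FAULTED", "UNAVAIL"}


def dropped_devices(status_text):
    """Return (name, state, note) for each config row in a dropped state."""
    found = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        # The table starts at the NAME/STATE header row.
        if stripped.startswith("NAME") and "STATE" in stripped:
            in_config = True
            continue
        if not in_config:
            continue
        # A blank line (or the errors: line) ends the table.
        if not stripped or stripped.startswith("errors:"):
            in_config = False
            continue
        # Columns: NAME STATE READ WRITE CKSUM [free-form note]
        fields = stripped.split(None, 5)
        if len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        note = fields[5] if len(fields) > 5 else ""
        if state in DROPPED_STATES:
            found.append((name, state, note))
    return found


if __name__ == "__main__":
    for name, state, note in dropped_devices(SAMPLE):
        print(name, state, note)
```

Run against the first status output above, this reports only the REMOVED
entry together with its "was /dev/ada0" note, which is the piece that
identifies the physical drive.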

...and from dmesg at the point I did that:

(ada3:siisch3:0:0:0): lost device
(ada3:siisch3:0:0:0): removing device entry
ada3 at siisch3 bus 0 scbus3 target 0 lun 0
ada3: <WDC WD1002FBYS-02A6B0 03.00C06> ATA-8 SATA 2.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
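
To see whether it really is the same two drives or channels dropping each
time, the "lost device" lines can be counted per device. This is a rough
sketch under the assumption that the dmesg text is available as a string and
that the lines have the same shape as the excerpt above; the name
`count_lost_devices` is hypothetical.

```python
import re
from collections import Counter

# SAMPLE is taken from the dmesg excerpt above.
SAMPLE = """\
(ada3:siisch3:0:0:0): lost device
(ada3:siisch3:0:0:0): removing device entry
ada3 at siisch3 bus 0 scbus3 target 0 lun 0
"""

# Matches e.g. "(ada3:siisch3:0:0:0): lost device" and captures
# the device name and the controller channel it hangs off.
LOST_RE = re.compile(r"^\((\w+):(\w+):\d+:\d+:\d+\): lost device")


def count_lost_devices(dmesg_text):
    """Count 'lost device' events keyed by (device, channel)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LOST_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (device, channel), n in count_lost_devices(SAMPLE).items():
        print(device, channel, n)
```

If the counts over a longer log pile up on only two channels of the same
controller, that would be consistent with the alternating behaviour described
above, though it would not by itself say whether the drives, the cabling or
the controller is at fault.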

here is the boot dmesg:


Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #0: Fri Jan  6 12:41:32 GMT 2012
    pete@skerry.drayhouse:/usr/obj/usr/src/sys/GENERIC amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz (2992.52-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Family = 6  Model = 17  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 4299161600 (4100 MB)
avail memory = 4024582144 (3838 MB)
ACPI APIC Table: <COMPAQ BEARLAKE>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <HPQOEM SLIC-BPC> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, dff00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xf808-0xf80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
siis0: <SiI3124 SATA controller> port 0x3100-0x310f mem 0xf0308000-0xf030807f,0xf0300000-0xf0307fff irq 16 at device 0.0 on pci2
siis0: [ITHREAD]
siisch0: <SIIS channel> at channel 0 on siis0
siisch0: [ITHREAD]
siisch1: <SIIS channel> at channel 1 on siis0
siisch1: [ITHREAD]
siisch2: <SIIS channel> at channel 2 on siis0
siisch2: [ITHREAD]
siisch3: <SIIS channel> at channel 3 on siis0
siisch3: [ITHREAD]
vgapci0: <VGA-compatible display> port 0x4240-0x4247 mem 0xf0100000-0xf017ffff,0xe0000000-0xefffffff,0xf0000000-0xf00fffff irq 16 at device 2.0 on pci0
agp0: <Intel Q33 SVGA controller> on vgapci0
agp0: aperture size is 256M, detected 6140k stolen memory
pci0: <simple comms> at device 3.0 (no driver attached)
em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0x4100-0x411f mem 0xf0180000-0xf019ffff,0xf01a4000-0xf01a4fff irq 19 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:1f:29:d3:51:be
uhci0: <Intel 82801I (ICH9) USB controller> port 0x4120-0x413f irq 20 at device 26.0 on pci0
uhci0: [ITHREAD]
usbus0: <Intel 82801I (ICH9) USB controller> on uhci0
uhci1: <Intel 82801I (ICH9) USB controller> port 0x4140-0x415f irq 21 at device 26.1 on pci0
uhci1: [ITHREAD]
usbus1: <Intel 82801I (ICH9) USB controller> on uhci1
uhci2: <Intel 82801I (ICH9) USB controller> port 0x4160-0x417f irq 22 at device 26.2 on pci0
uhci2: [ITHREAD]
usbus2: <Intel 82801I (ICH9) USB controller> on uhci2
ehci0: <Intel 82801I (ICH9) USB 2.0 controller> mem 0xf01a5000-0xf01a53ff irq 22 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus3: EHCI version 1.0
usbus3: <Intel 82801I (ICH9) USB 2.0 controller> on ehci0
pci0: <multimedia, HDA> at device 27.0 (no driver attached)
pcib3: <ACPI PCI-PCI bridge> irq 20 at device 28.0 on pci0
pci32: <ACPI PCI bus> on pcib3
siis1: <SiI3132 SATA controller> port 0x1100-0x117f mem 0xf0404000-0xf040407f,0xf0400000-0xf0403fff irq 16 at device 0.0 on pci32
siis1: [ITHREAD]
siisch4: <SIIS channel> at channel 0 on siis1
siisch4: [ITHREAD]
siisch5: <SIIS channel> at channel 1 on siis1
siisch5: [ITHREAD]
pcib4: <ACPI PCI-PCI bridge> irq 21 at device 28.1 on pci0
pci48: <ACPI PCI bus> on pcib4
uhci3: <Intel 82801I (ICH9) USB controller> port 0x4180-0x419f irq 20 at device 29.0 on pci0
uhci3: [ITHREAD]
usbus4: <Intel 82801I (ICH9) USB controller> on uhci3
uhci4: <Intel 82801I (ICH9) USB controller> port 0x41a0-0x41bf irq 21 at device 29.1 on pci0
uhci4: [ITHREAD]
usbus5: <Intel 82801I (ICH9) USB controller> on uhci4
uhci5: <Intel 82801I (ICH9) USB controller> port 0x41c0-0x41df irq 22 at device 29.2 on pci0
uhci5: [ITHREAD]
usbus6: <Intel 82801I (ICH9) USB controller> on uhci5
ehci1: <Intel 82801I (ICH9) USB 2.0 controller> mem 0xf01a5400-0xf01a57ff irq 20 at device 29.7 on pci0
ehci1: [ITHREAD]
usbus7: EHCI version 1.0
usbus7: <Intel 82801I (ICH9) USB 2.0 controller> on ehci1
pcib5: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci7: <ACPI PCI bus> on pcib5
em1: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0x2100-0x213f mem 0xf0200000-0xf021ffff irq 20 at device 4.0 on pci7
em1: [FILTER]
em1: Ethernet address: 00:07:e9:10:d8:86
em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.3> port 0x2140-0x217f mem 0xf0220000-0xf023ffff irq 21 at device 4.1 on pci7
em2: [FILTER]
em2: Ethernet address: 00:07:e9:10:d8:87
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH9 SATA300 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x4200-0x420f,0x4210-0x421f irq 18 at device 31.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
atapci1: <Intel ICH9 SATA300 controller> port 0x4258-0x425f,0x4270-0x4273,0x4260-0x4267,0x4274-0x4277,0x4220-0x422f,0x4230-0x423f irq 18 at device 31.5 on pci0
atapci1: [ITHREAD]
ata2: <ATA channel> at channel 0 on atapci1
ata2: [ITHREAD]
ata3: <ATA channel> at channel 1 on atapci1
ata3: [ITHREAD]
acpi_button0: <Power Button> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
acpi_hpet1: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
device_attach: acpi_hpet1 attach returned 12
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
(noperiph:siisch0:0:-1:-1): rescan already queued
(noperiph:siisch1:0:-1:-1): rescan already queued
(noperiph:siisch2:0:-1:-1): rescan already queued
(noperiph:siisch3:0:-1:-1): rescan already queued
(noperiph:siisch4:0:-1:-1): rescan already queued
ZFS filesystem version 5
ZFS storage pool version 28
Timecounters tick every 1.000 msec
vboxdrv: fAsync=0 offMin=0x168 offMax=0x40b
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 480Mbps High Speed USB v2.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 12Mbps Full Speed USB v1.0
usbus6: 12Mbps Full Speed USB v1.0
usbus7: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <Intel> at usbus3
uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
ugen4.1: <Intel> at usbus4
uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
ugen5.1: <Intel> at usbus5
uhub5: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
ugen6.1: <Intel> at usbus6
uhub6: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus6
ugen7.1: <Intel> at usbus7
uhub7: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus7
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
acd0: DVDR <HL-DT-ST DVD-RAM GSA-H60L/R90C> at ata1-master UDMA100 SATA 1.5Gb/s
uhub3: 6 ports with 6 removable, self powered
uhub7: 6 ports with 6 removable, self powered
ugen7.2: <Generic> at usbus7
umass0: <Bulk-In, Bulk-Out, Interface> on usbus7
umass0:  SCSI over Bulk-Only; quirks = 0x4000
umass0:7:0:-1: Attached to scbus7
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x40 0x00 0x01
(probe1:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe1:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe1:umass-sim0:0:0:0): SCSI status: Check Condition
(probe1:umass-sim0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
ugen1.2: <vendor 0x1241> at usbus1
ukbd0: <vendor 0x1241 USB Keyboard, class 0/0, rev 1.10/2.90, addr 2> on usbus1
kbd2 at ukbd0
uhid0: <vendor 0x1241 USB Keyboard, class 0/0, rev 1.10/2.90, addr 2> on usbus1
(probe0:umass-sim0:0:0:1): TEST UNIT READY. CDB: 0 20 0 0 0 0 
(probe0:umass-sim0:0:0:1): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:1): SCSI status: Check Condition
(probe0:umass-sim0:0:0:1): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:2): TEST UNIT READY. CDB: 0 40 0 0 0 0 
(probe0:umass-sim0:0:0:2): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:2): SCSI status: Check Condition
(probe0:umass-sim0:0:0:2): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:3): TEST UNIT READY. CDB: 0 60 0 0 0 0 
(probe0:umass-sim0:0:0:3): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:3): SCSI status: Check Condition
(probe0:umass-sim0:0:0:3): SCSI sense: NOT READY asc:3a,0 (Medium not present)
ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD1002FBYS-02A6B0 03.00C06> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1 at siisch1 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD1002FBYS-02A6B0 03.00C06> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2 at siisch2 bus 0 scbus2 target 0 lun 0
ada2: <WDC WD1002FBYS-02A6B0 03.00C06> ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada3 at siisch3 bus 0 scbus3 target 0 lun 0
ada3: <WDC WD1002FBYS-02A6B0 03.00C06> ATA-8 SATA 2.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada4 at siisch4 bus 0 scbus4 target 0 lun 0
ada4: <OCZ-ONYX 1.6> ATA-8 SATA 2.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
ada4: Command Queueing enabled
ada4: 30533MB (62533296 512 byte sectors: 16H 63S/T 16383C)
da0 at umass-sim0 bus 0 scbus7 target 0 lun 0
da0: <Generic- Compact Flash 1.00> Removable Direct Access SCSI-0 device 
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
cd0 at ata1 bus 0 scbus6 target 0 lun 0
cd0: <HL-DT-ST DVD-RAM GSA-H60L R90C> Removable CD-ROM SCSI-0 device 
cd0: 100.000MB/s transfers
cd0: cd present [3008 x 2048 byte records]
da1 at umass-sim0 bus 0 scbus7 target 0 lun 1
da1: <Generic- SM/xD-Picture 1.00> Removable Direct Access SCSI-0 device 
da1: 40.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
SMP: AP CPU #1 Launched!
da2 at umass-sim0 bus 0 scbus7 target 0 lun 2
da2: <Generic- SD/MMC 1.00> Removable Direct Access SCSI-0 device 
da2: 40.000MB/s transfers
da2: Attempt to query device size failed: NOT READY, Medium not present
da3 at umass-sim0 bus 0 scbus7 target 0 lun 3
da3: <Generic- MS/MS-Pro 1.00> Removable Direct Access SCSI-0 device 
da3: 40.000MB/s transfers
da3: Attempt to query device size failed: NOT READY, Medium not present
Trying to mount root from ufs:/dev/gpt/skerry-root
Setting hostuuid: 0071dfa5-eaab-11df-88e2-02dc1053ff3a.
Setting hostid: 0xe54799ad.
Entropy harvesting:
 interrupts
 ethernet
 point_to_point
 kickstart
.
Starting file system checks:
/dev/gpt/skerry-root: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/gpt/skerry-root: clean, 7792879 free (65439 frags, 965930 blocks, 0.6% fragmentation)
Mounting local file systems:




