Date: Sun, 06 Apr 2014 02:35:57 +0100
From: Kaya Saman <kayasaman@gmail.com>
To: kpneal@pobox.com
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>, Vusa Moyo <vusa@tuxsystems.co.za>
Subject: Re: Device Removed by Administrator in ZPOOL?
Message-ID: <5340AF7D.5000204@gmail.com>
In-Reply-To: <20140406002849.GA14765@neutralgood.org>
References: <53408FAB.8080202@gmail.com> <512A7865-CEFD-4BDA-A060-AE911BEDD5B7@tuxsystems.co.za> <53409BF1.6050001@gmail.com> <20140406002849.GA14765@neutralgood.org>
On 04/06/2014 01:28 AM, kpneal@pobox.com wrote:
> On Sun, Apr 06, 2014 at 01:12:33AM +0100, Kaya Saman wrote:
>> Many thanks for the response!
>>
>> The server doesn't show any lights for "drive error" however, the blue
>> read LED isn't coming on, on the drive in question (as removed from ZPOOL).
>>
>> I will have a look for LSI tools in @Ports and also see if the BIOS LSI
>> hook comes up with anything.
> Have you seen any other errors in your logs? Seems like if a drive fails
> there should be some other error message reporting the errors that resulted
> in ZFS marking the drive removed. What does 'dmesg' have to say?
>
> Once ZFS has stopped using the drive (for whatever reason) I wouldn't
> expect you to see anything else happening on the drive. So the light not
> coming on doesn't really tell us anything new.
>
> Also, aren't 'green' drives the kind that spin down and then have to spin
> back up when a request comes in? I don't know what happens if a drive takes
> "too long" to respond because it has spun down. I have no idea how FreeBSD
> handles that, and I also don't know if ZFS adds anything to the equation.
> Hopefully someone else here will clue me/us in.
>

Something interesting I've just read.....

https://forums.freebsd.org/viewtopic.php?&t=4534

[quote]
Very, very, very poor data throughput.
Drive dropping off the controller.
Running an array verify would bring the server to a grinding halt.
Incompatibilities with riser cards used in 2U rackmount servers (didn't matter if it was the el-cheapo one that came with the case, or a Tyan one specifically for the motherboard).
Lack of useable management tools for Linux/FreeBSD (the mega* tools are a joke compared to 3dm2 or even the BIOS config tool for 3Ware).
[/quote]

I wonder if that's the issue with my system, that the drive has literally "dropped off the controller"?

>> On 04/06/2014 12:44 AM, Vusa Moyo wrote:
>>> This is more than likely a failed drive.
>>>
>>> Have you physically looked at the server for orange lights which may help ID the failed drive??
>>>
>>> There could also be tools to query the lsi hba.
>>>
>>> Sent from my iPad
>>>
>>>> On Apr 6, 2014, at 1:20 AM, Kaya Saman <kayasaman@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm running FreeBSD 10.0 x64 on a Xeon E5 based system with 8GB RAM.
>>>>
>>>>
>>>> Checking the ZPOOL status I saw one of my drives has been offlined... the exact error is this:
>>>>
>>>> # zpool status -v
>>>>   pool: ZPOOL_2
>>>>  state: DEGRADED
>>>> status: One or more devices has been removed by the administrator.
>>>>         Sufficient replicas exist for the pool to continue functioning in a
>>>>         degraded state.
>>>> action: Online the device using 'zpool online' or replace the device with
>>>>         'zpool replace'.
>>>>   scan: scrub repaired 0 in 9h3m with 0 errors on Sat Apr 5 03:46:55 2014
>>>> config:
>>>>
>>>>         NAME                      STATE     READ WRITE CKSUM
>>>>         ZPOOL_2                   DEGRADED     0     0     0
>>>>           raidz2-0                DEGRADED     0     0     0
>>>>             da0                   ONLINE       0     0     0
>>>>             14870388343127772554  REMOVED      0     0     0  was /dev/da1
>>>>             da2                   ONLINE       0     0     0
>>>>             da3                   ONLINE       0     0     0
>>>>             da4                   ONLINE       0     0     0
>>>>
>>>>
>>>> I think this is due to a dead disk however, I'm not certain which is why I wanted to ask here as I didn't remove the drive at all..... rather then some kind of OS/ZFS error.
>>>>
>>>>
>>>> The drives are 2TB WD Green drives all connected to an LSI HBA; everything is still under warranty so no big issue there and I have external backups too so I'm not really that worried, I'm just trying to work out what's going on.
>>>>
>>>>
>>>> Are my suspicions correct or should I simply try to reboot the system and see if the drive comes back online?
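
To make the reading of the `zpool status -v` listing quoted above a bit more concrete, here is a minimal Python sketch (not part of the original thread) that picks out every row of the config table whose state is not ONLINE. The function name `unhealthy_devices` and the `SAMPLE` constant are made up for illustration; the sample rows are taken from the quoted output, and the parsing simply assumes the NAME / STATE / READ / WRITE / CKSUM column layout shown there.

```python
# Config table copied from the `zpool status -v` output quoted in this thread.
SAMPLE = """\
config:

        NAME                      STATE     READ WRITE CKSUM
        ZPOOL_2                   DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            da0                   ONLINE       0     0     0
            14870388343127772554  REMOVED      0     0     0  was /dev/da1
            da2                   ONLINE       0     0     0
            da3                   ONLINE       0     0     0
            da4                   ONLINE       0     0     0
"""


def unhealthy_devices(status_text):
    """Return (name, state, note) for each config row that is not ONLINE.

    Hypothetical helper: it assumes whitespace-separated columns in the
    order NAME STATE READ WRITE CKSUM, followed by an optional note.
    """
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["NAME", "STATE"]:
            # Header row of the config table; device rows follow.
            in_config = True
            continue
        if fields[0] == "errors:":
            # The trailing "errors:" summary ends the config table.
            break
        if not in_config or len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        if state != "ONLINE":
            # Anything after the three counters is a note such as "was /dev/da1".
            rows.append((name, state, " ".join(fields[5:])))
    return rows


if __name__ == "__main__":
    for name, state, note in unhealthy_devices(SAMPLE):
        print(name, state, note)
```

On the sample this reports the pool and the raidz2 vdev as DEGRADED and the one member as REMOVED with the note "was /dev/da1", which only restates what ZFS already says. It does not tell you why the device went away; that still needs the controller and kernel logs asked about above.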
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5340AF7D.5000204>