Message-ID: <5340AF7D.5000204@gmail.com>
Date: Sun, 06 Apr 2014 02:35:57 +0100
From: Kaya Saman
To: kpneal@pobox.com
Cc: FreeBSD Filesystems, Vusa Moyo
Subject: Re: Device Removed by Administrator in ZPOOL?
References: <53408FAB.8080202@gmail.com> <512A7865-CEFD-4BDA-A060-AE911BEDD5B7@tuxsystems.co.za> <53409BF1.6050001@gmail.com> <20140406002849.GA14765@neutralgood.org>
In-Reply-To: <20140406002849.GA14765@neutralgood.org>

On 04/06/2014 01:28 AM, kpneal@pobox.com wrote:
> On Sun, Apr 06, 2014 at 01:12:33AM +0100, Kaya Saman wrote:
>> Many thanks for the response!
>>
>> The server doesn't show any lights for "drive error" however, the blue
>> read LED isn't coming on, on the drive in question (as removed from ZPOOL).
>>
>> I will have a look for LSI tools in @Ports and also see if the BIOS LSI
>> hook comes up with anything.
> Have you seen any other errors in your logs? Seems like if a drive fails
> there should be some other error message reporting the errors that resulted
> in ZFS marking the drive removed. What does 'dmesg' have to say?
>
> Once ZFS has stopped using the drive (for whatever reason) I wouldn't
> expect you to see anything else happening on the drive. So the light not
> coming on doesn't really tell us anything new.
>
> Also, aren't 'green' drives the kind that spin down and then have to spin
> back up when a request comes in? I don't know what happens if a drive takes
> "too long" to respond because it has spun down. I have no idea how FreeBSD
> handles that, and I also don't know if ZFS adds anything to the equation.
> Hopefully someone else here will clue me/us in.
>

Something interesting I've just read.....

https://forums.freebsd.org/viewtopic.php?&t=4534

[quote]
Very, very, very poor data throughput.
Drive dropping off the controller.
Running an array verify would bring the server to a grinding halt.
Incompatibilities with riser cards used in 2U rackmount servers (didn't matter if it was the el-cheapo one that came with the case, or a Tyan one specifically for the motherboard).
Lack of useable management tools for Linux/FreeBSD (the mega* tools are a joke compared to 3dm2 or even the BIOS config tool for 3Ware).
[/quote]

I wonder if that's the issue with my system, that the drive has literally "dropped off the controller"?

>> On 04/06/2014 12:44 AM, Vusa Moyo wrote:
>>> This is more than likely a failed drive.
>>>
>>> Have you physically looked at the server for orange lights which may help ID the failed drive??
>>>
>>> There could also be tools to query the lsi hba.
>>>
>>> Sent from my iPad
>>>
>>>> On Apr 6, 2014, at 1:20 AM, Kaya Saman wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm running FreeBSD 10.0 x64 on a Xeon E5 based system with 8GB RAM.
>>>>
>>>> Checking the ZPOOL status I saw one of my drives has been offlined... the exact error is this:
>>>>
>>>> # zpool status -v
>>>>   pool: ZPOOL_2
>>>>  state: DEGRADED
>>>> status: One or more devices has been removed by the administrator.
>>>>         Sufficient replicas exist for the pool to continue functioning in a
>>>>         degraded state.
>>>> action: Online the device using 'zpool online' or replace the device with
>>>>         'zpool replace'.
>>>>   scan: scrub repaired 0 in 9h3m with 0 errors on Sat Apr 5 03:46:55 2014
>>>> config:
>>>>
>>>>         NAME                      STATE     READ WRITE CKSUM
>>>>         ZPOOL_2                   DEGRADED     0     0     0
>>>>           raidz2-0                DEGRADED     0     0     0
>>>>             da0                   ONLINE       0     0     0
>>>>             14870388343127772554  REMOVED      0     0     0  was /dev/da1
>>>>             da2                   ONLINE       0     0     0
>>>>             da3                   ONLINE       0     0     0
>>>>             da4                   ONLINE       0     0     0
>>>>
>>>> I think this is due to a dead disk however, I'm not certain which is why I wanted to ask here as I didn't remove the drive at all..... rather than some kind of OS/ZFS error.
>>>>
>>>> The drives are 2TB WD Green drives all connected to an LSI HBA; everything is still under warranty so no big issue there and I have external backups too so I'm not really that worried, I'm just trying to work out what's going on.
>>>>
>>>> Are my suspicions correct or should I simply try to reboot the system and see if the drive comes back online?
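
For reference, the checks discussed above (which pool member is no longer ONLINE, and what 'dmesg' says about it) could be sketched roughly as follows. This is a minimal sketch and not something posted in the thread: the function name unhealthy_vdevs is hypothetical, and it assumes the usual 'zpool status' layout with a "NAME STATE READ WRITE CKSUM" header row followed by an "errors:" line.

```shell
#!/bin/sh
# Sketch only: list pool members whose STATE column is not ONLINE.
# Reads `zpool status` output on stdin, so it also works on saved output.
# The function name unhealthy_vdevs is hypothetical, not a ZFS tool.
unhealthy_vdevs() {
    awk '
        # The device table starts after the NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The "errors:" line marks the end of the device table.
        in_table && $1 == "errors:" { in_table = 0 }
        # Device rows have at least five columns; keep any trailing
        # note such as "was /dev/da1".
        in_table && NF >= 5 && $2 != "ONLINE" {
            note = ""
            for (i = 6; i <= NF; i++) note = note " " $i
            print $1, $2 note
        }
    '
}

# On the affected machine (pool and device names from the original post):
#   zpool status -v ZPOOL_2 | unhealthy_vdevs
#   dmesg | grep -i da1    # look for the error messages kpneal asked about
```

With the status output quoted above, this would list the pool, the raidz2-0 vdev, and the REMOVED member together with its "was /dev/da1" note. Whether to then try 'zpool online' or 'zpool replace', as the action line suggests, depends on what the logs show.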