From: "Zaphod Beeblebrox"
To: "Karl Pielorz"
Date: Fri, 12 Sep 2008 11:50:41 -0400
Message-ID: <5f67a8c40809120850u60c23fc4m7c4c1341fb2c4966@mail.gmail.com>
In-Reply-To: <3BE629D093001F6BA2C6791C@Slim64.dmpriest.net.uk>
References: <20080912132102.GB56923@icarus.home.lan> <3BE629D093001F6BA2C6791C@Slim64.dmpriest.net.uk>
Cc: freebsd-hackers@freebsd.org, Jeremy Chadwick
Subject: Re: ZFS w/failing drives - any equivalent of Solaris FMA?
List-Id: Technical Discussions relating to FreeBSD

On Fri, Sep 12, 2008 at 10:34 AM, Karl Pielorz wrote:

> --On 12 September 2008 06:21 -0700 Jeremy Chadwick wrote:
>
>> As far as I know, there is no such "standard" mechanism in FreeBSD. If
>> the drive falls off the bus entirely (e.g. detached), I would hope ZFS
>> would notice that. I can imagine it (might) also depend on if the disk
>> subsystem you're using is utilising CAM or not (e.g. disks should be daX
>> not adX); Scott Long might know if something like this is implemented in
>> CAM. I'm fairly certain nothing like this is implemented in ata(4).
>
> For ATA, at the moment - I don't think it'll notice even if a drive
> detaches. I think like my system the other day, it'll just keep issuing I/O
> commands to the drive, even if it's disappeared (it might get much 'quicker
> failures' if the device has 'gone' to the point of FreeBSD just quickly
> returning 'fail' for every request).

Since I had the opportunity, I tested this recently for both CAM and ATA.
Now, the RAID engine was gmirror in both cases (my production hardware
doesn't do ZFS yet), but I expect the reaction to be somewhat the same.

Both systems were Dell 1Us. One, an R200, had SATA disks attached to a
plain SATA controller. I believe it may have supported RAID1, but I didn't
use that functionality. When a drive was removed from it, it stalled for
some time (30 minutes?) and then resumed working.
By the time I could type on the machine again, gmirror had decided that
the drive was gone and marked the mirror as degraded.

The other system was a 1950-III with a SCSI SAS controller attached to an
SAS hot-swap backplane. The drives themselves were 750G SATA drives.
Yanking one of them resulted in about 5 seconds of disruption, followed by
gmirror realizing the problem and marking the mirror degraded.

Neither system was heavily loaded during the test.
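
For anyone who wants to time this kind of test rather than eyeball it, here is a rough sketch of how the "mirror went degraded" moment could be detected from the output of `gmirror status`. This is not what was used in the tests above. The helper names (`degraded_mirrors`, `read_gmirror_status`) and the sample text are made up for illustration, and the parser assumes the usual three-column layout (Name, Status, Components) with extra components on continuation lines.

```python
import subprocess


def degraded_mirrors(status_text):
    """Return the names of mirrors whose Status column is not COMPLETE."""
    degraded = []
    for line in status_text.splitlines():
        fields = line.split()
        # Mirror rows start with "mirror/<name>"; the header row and the
        # continuation rows listing further components do not.
        if len(fields) >= 2 and fields[0].startswith("mirror/"):
            name, status = fields[0], fields[1]
            if status != "COMPLETE":
                degraded.append(name)
    return degraded


def read_gmirror_status():
    """Run `gmirror status` and return its text (needs a FreeBSD host)."""
    result = subprocess.run(
        ["gmirror", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample only, not captured from the machines described above.
SAMPLE = """\
      Name    Status  Components
mirror/gm0  DEGRADED  ad4 (ACTIVE)
mirror/gm1  COMPLETE  da0 (ACTIVE)
                      da1 (ACTIVE)
"""
```

Calling `degraded_mirrors(read_gmirror_status())` in a loop with a timestamp would give a number for how long each controller takes to give up on a pulled drive, at least once the machine is responsive again.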