From: Sergey Kandaurov
To: Charles Owens
Cc: Scott Long, freebsd-hardware@freebsd.org
Date: Sat, 16 Oct 2010 02:18:57 +0400
In-Reply-To: <4CB8BED6.8040204@greatbaysoftware.com>
References: <4CB8A614.6000707@greatbaysoftware.com> <4CB8BED6.8040204@greatbaysoftware.com>
Subject: Re: mfiutil
 reports "PSTATE 0x0020" new drive state
List-Id: General discussion of FreeBSD hardware

On 16 October 2010 00:51, Charles Owens wrote:
>  Hmm... the problem appears to have resolved itself.  After a few hours the
> new drive seems to have gone back into the array, and the original hot spare
> drive put back into hot-spare state.
>
> So I'm interpreting state 0x0020 to therefore mean something like "hang on
> while I use this new drive to automatically put everything back as it was
> before the failure".  Is this correct?
>
> Thanks,
> Charles
>
> [root@Bsvr ~]# mfiutil show drives
> mfi0 Physical Drives:
> (  149G) ONLINE  SATA enclosure 1, slot 0
> (  149G) ONLINE  SATA enclosure 1, slot 1
> (  149G) ONLINE  SATA enclosure 1, slot 2
> (  149G) HOT SPARE  SATA enclosure 1, slot 3
> (  149G) ONLINE  SATA enclosure 1, slot 4
>
> On 10/15/10 3:05 PM, Charles Owens wrote:
>>
>>  Hello,
>>
>> We have a mfi-based RAID array with a failed drive.  When replacing the
>> failed drive with a brand new one 'mfiutil' reports it having status of
>> "PSTATE 0x0020".  Attempts to work with the drive to make it a hot spare are
>> unsuccessful (eg. using "good" and/or "add" subcommands of mfiutil).  We've
>> tested procedures for replacing failed drives in the past and haven't run
>> into this.
>>
>> Looking at the code for mfiutil it appears that this is happening because
>> the mfi controller is reporting a drive status code that mfiutil doesn't
>> know about.  The system is remote and in production, so booting into the LSI
>> in-BIOS RAID-management-tool is not an attractive option.
>>
>> Any help with understanding the situation and potential next steps would
>> be greatly appreciated.
>> More background information follows below.
>>
>> Thanks,
>>
>> Charles
>>
>>
>> Storage configuration:  4-drive RAID 10 array plus one hot spare
>>
>> [root@svr ~]# mfiutil show config
>> mfi0 Configuration: 2 arrays, 1 volumes, 0 spares
>>     array 0 of 2 drives:
>>         drive 0 (  149G) ONLINE  SATA enclosure 1, slot 0
>>         drive 1 (  149G) ONLINE  SATA enclosure 1, slot 1
>>     array 1 of 2 drives:
>>         drive 4 (  149G) ONLINE  SATA enclosure 1, slot 3
>>         drive 3 (  149G) ONLINE  SATA enclosure 1, slot 2
>>     volume mfid0 (296G) RAID-1 256K OPTIMAL spans:
>>         array 0
>>         array 1
>>
>> [root@svr ~]# mfiutil show drives
>> mfi0 Physical Drives:
>> (  149G) ONLINE  SATA enclosure 1, slot 0
>> (  149G) ONLINE  SATA enclosure 1, slot 1
>> (  149G) ONLINE  SATA enclosure 1, slot 2
>> (  149G) ONLINE  SATA enclosure 1, slot 3
>> (  149G) PSTATE 0x0020  SATA enclosure 1, slot 4
>>
>> mfi0:  port 0x1000-0x10ff mem
>> ...
>>

Hi, Charles Owens.

0x20 is most likely the copyback physical state, which is missing from
enum mfi_pd_state. What you've experienced is the copyback feature in
action :)

Your array was rebuilt with the HSP (hot spare) as an ordinary PD
(physical drive). Then you swapped the failed drive for a good one, and
the HSP went into copyback mode to move all its data back to the good
disk. That prevents reordering of disk numbers in the array and a
double rebuild.

-- 
wbr, pluknet
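
To illustrate why the drive showed up as "PSTATE 0x0020", here is a minimal C sketch of a state-to-name lookup that falls back to printing the raw code when the state is not in its enum. This is not the actual mfiutil source: the names pd_state and pd_state_name are hypothetical, and the numeric values and display strings are assumptions drawn from this thread and from memory of the mfi(4) register header, so check them against your tree before relying on them.

```c
#include <stdio.h>

/*
 * Illustrative subset of physical-drive states.
 * ASSUMPTION: values mirror enum mfi_pd_state; 0x20 is the copyback
 * state suggested in this reply.
 */
enum pd_state {
    PD_STATE_UNCONFIGURED_GOOD = 0x00,
    PD_STATE_UNCONFIGURED_BAD  = 0x01,
    PD_STATE_HOT_SPARE         = 0x02,
    PD_STATE_OFFLINE           = 0x10,
    PD_STATE_FAILED            = 0x11,
    PD_STATE_REBUILD           = 0x14,
    PD_STATE_ONLINE            = 0x18,
    PD_STATE_COPYBACK          = 0x20
};

/*
 * Hypothetical helper: return a display name for a drive state.
 * Unknown codes are formatted into the caller's buffer as
 * "PSTATE 0x....", which is the form seen in the report above.
 */
static const char *
pd_state_name(unsigned int state, char *buf, size_t buflen)
{
    switch (state) {
    case PD_STATE_UNCONFIGURED_GOOD: return "UNCONFIGURED GOOD";
    case PD_STATE_UNCONFIGURED_BAD:  return "UNCONFIGURED BAD";
    case PD_STATE_HOT_SPARE:         return "HOT SPARE";
    case PD_STATE_OFFLINE:           return "OFFLINE";
    case PD_STATE_FAILED:            return "FAILED";
    case PD_STATE_REBUILD:           return "REBUILD";
    case PD_STATE_ONLINE:            return "ONLINE";
    case PD_STATE_COPYBACK:          return "COPYBACK";
    default:
        /* No name known for this code: show it raw. */
        snprintf(buf, buflen, "PSTATE 0x%04x", state);
        return buf;
    }
}
```

Without the PD_STATE_COPYBACK case, a drive in state 0x20 takes the default branch, which would explain the output in the original report. The "COPYBACK" label itself is an assumption for illustration.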