From owner-freebsd-scsi@freebsd.org  Sun Feb 14 12:59:50 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7878FAA063B;
 Sun, 14 Feb 2016 12:59:50 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: from mail2.openmailbox.org (mail2.openmailbox.org [62.4.1.33])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3B4D6125A;
 Sun, 14 Feb 2016 12:59:49 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: by mail2.openmailbox.org (Postfix, from userid 1004)
 id B72812AC23D8; Sun, 14 Feb 2016 13:59:40 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org;
 s=openmailbox; t=1455454780;
 bh=svkyVb8OHK6GP+S1g1xbl/EC9JMRQzemiJAwu9H+ovU=;
 h=Date:From:To:Subject:From;
 b=UEIko6W7oaPLdp7a4SQDtKMWjvb17/05HsTeWmJR8Spg1keMqy558StagN4nSXg++
 sFxbIOJt4V6kRncsQqSeDrhpvixzSkzp+hTM6qDfh9z0U8/qVKhbjTET3eeMgYPLyE
 eB163YufhMQpBYKTdtocYZF6i7x3hDhInbNjkt58=
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on openmailbox-b2
X-Spam-Level: 
X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50,
 DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0
Received: from www.openmailbox.org (openmailbox-b1 [10.91.69.218])
 by mail2.openmailbox.org (Postfix) with ESMTP id 97C902AC3C0E;
 Sun, 14 Feb 2016 13:59:30 +0100 (CET)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Sun, 14 Feb 2016 19:59:30 +0700
From: Tinker <tinkr@openmailbox.org>
To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org,
 freebsd-fs@freebsd.org
Subject: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the
 Raid's physical drives break, how is it reported in the
 =?UTF-8?Q?logs=3F?=
Message-ID: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
X-Sender: tinkr@openmailbox.org
User-Agent: Roundcube Webmail/1.0.6
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Feb 2016 12:59:50 -0000

( ** Extremely sorry for crossposting! Was unclear where this RAID 
adapter question belongs, please clarify and I'll keep to one single 
list!
      Posted to all of stable@, scsi@ and fs@ .)


Hi,

When you run a card handled by the MRSAS driver, such as an Avago (LSI) 
MegaRAID 9361 or 9266, and one of the physical RAID drives or a 
CacheCade drive eventually breaks, how is this reported to the FreeBSD 
host's dmesg or syslog?


I don't have the hardware at hand to check this myself. Since this card 
is so common, some of you probably have deep experience with it, which 
is why I am asking here.

I understand that as long as at least one underlying copy of the data is 
accessible, the RAID card will direct all access to that copy, so when 
it comes to keeping IO working without interruption, the LSI card does a 
great job.


At some point, an SSD or HDD will break down, either completely (it 
won't connect and its SMART interface says the drive is worn out) or 
more subtly, by taking a very long time to complete its operations.

My understanding is that the RAID card will automatically and 
transparently take such drives out of use. Now to the main point:


As an admin, I want to be informed when this happens, i.e. when an 
underlying physical RAID disk or a CacheCade disk is taken out of use or 
otherwise malfunctions.


Does the MRSAS driver output this to dmesg or syslog somehow?


Reading 
https://svnweb.freebsd.org/base/stable/10/sys/dev/mrsas/mrsas.c?revision=284267&view=markup 
, the card seems to have an "event log" that the driver downloads from 
the card in plaintext (??), but I can't tell from the source code 
where that information is sent. And of course I can't see what 
that event log would contain in those cases.

("mfiutil" has a "show events" argument, though mfiutil is only for 
the related "mfi" driver, which does not support both the 92XX and the 
93XX cards. Even so, I'd still be interested to know how it reports 
a broken drive.)

http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf 
on page 305, section "A.2 Event Messages": I don't know which 
LSI chip this document covers, but it does not clearly list a particular 
event message for when an individual underlying disk has 
broken. I don't even see an event for when a hot spare is taken 
into use!

You who have the experience, can you clarify please? Thanks :D

Tinker


From owner-freebsd-scsi@freebsd.org  Sun Feb 14 15:13:48 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2143DAA7FDE;
 Sun, 14 Feb 2016 15:13:48 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: from mail2.openmailbox.org (mail2.openmailbox.org [62.4.1.33])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id D13091E38;
 Sun, 14 Feb 2016 15:13:47 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: by mail2.openmailbox.org (Postfix, from userid 1004)
 id 78B892AC260D; Sun, 14 Feb 2016 16:13:43 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org;
 s=openmailbox; t=1455462823;
 bh=a6xRsHv3dB8Og6u7p4fjbM5qiUhvubkqMeI/6wnFUGk=;
 h=Date:From:To:Subject:In-Reply-To:References:From;
 b=BGg9woQZ2saaEnpPj7pRPuzJLQ/6mxc71q99ZNWGdj82+STcxUMZ0lO/68mXp6e8N
 /kP+z4YL/Pm2g5+z1B8kN41weu7n5aZMcEk2A4bRN0Rn8MwKFqNVcOOU8Ws5PkyJ6q
 Datw1/fbJ+OFpKv1M1qTy9TQ+/j3aXiYZrgUEU/s=
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on openmailbox-b2
X-Spam-Level: 
X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50,
 DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0
Received: from www.openmailbox.org (openmailbox-b2 [10.91.69.220])
 by mail2.openmailbox.org (Postfix) with ESMTP id C48662AC564D;
 Sun, 14 Feb 2016 16:13:31 +0100 (CET)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Sun, 14 Feb 2016 22:13:31 +0700
From: Tinker <tinkr@openmailbox.org>
To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org,
 freebsd-fs@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the
 =?UTF-8?Q?logs=3F?=
In-Reply-To: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
Message-ID: <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
X-Sender: tinkr@openmailbox.org
User-Agent: Roundcube Webmail/1.0.6
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Feb 2016 15:13:48 -0000

(Will send any followup from now only to freebsd-scsi@ .)


Did some additional research and found that the disk failure indeed is 
reported in MRSAS' "event log".

So my final question then is, how do you extract it into userland (in 
the absence of an "mfiutil" as the MFI driver has)?


Details below. Thanks.

On 2016-02-14 19:59, Tinker wrote:
[...]
> http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
> on page 305, that is section "A.2 Event Messages" - I don't know for
> what LSI chip this document is, but, it does not list particular event
> message very clearly for when an individual underlying disk would have
> broken, I don't even see any event for when a hot spare would be taken
> in use!


Wait - this page:

https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array

(and also 
http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it 
)

gives an example of how the host system learns about broken disks:


Code: 0x00000051 .. Event Description: State change on VD 00/1 from 
OPTIMAL(3) to DEGRADED(2)


Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) 
from ONLINE(18) to FAILED(11)

(A disk that fails uncleanly seems to be shown as:)

Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) 
Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: 
b/00/00


And this version of the LSI documentation

http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf

gives a clearer definition of the physical and virtual drive states in 
"1.4.16 Physical Drive States"
and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.

So as we see, a physical drive breaking would

  * move the physical drive to state "FAILED"

  * move the Virtual Drive (that is, the logical exported drive) from 
"OPTIMAL" to "DEGRADED"


So then, it was indeed the card's "event log" that contains this info.


So the only remaining question is: *where* does FreeBSD's MRSAS driver 
send its event log?
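
Once the event text is available in userland by whatever means, the
state changes quoted above could be picked out with something like the
following. This is only a minimal sketch: the function name, the regular
expression and the set of "alert" states are my own assumptions, derived
purely from the three example lines above and not from any LSI or
FreeBSD tool.

```python
import re

# Pattern for the "State change on PD/VD ... from X(n) to Y(m)" text shown
# in the examples above. The exact format is an assumption based on those
# examples only.
STATE_CHANGE = re.compile(
    r"State change on (?P<kind>PD|VD) (?P<dev>\S+)\s+from\s+"
    r"(?P<old>[A-Z_]+)\(\d+\)\s+to\s+(?P<new>[A-Z_]+)\(\d+\)"
)

# States an admin would presumably want to be told about (assumption).
ALERT_STATES = {"FAILED", "DEGRADED"}


def find_alerts(event_text):
    """Return (kind, device, old_state, new_state) for worrying transitions."""
    # Descriptions may be wrapped across lines, so collapse all whitespace.
    flat = " ".join(event_text.split())
    alerts = []
    for m in STATE_CHANGE.finditer(flat):
        if m.group("new") in ALERT_STATES:
            alerts.append((m.group("kind"), m.group("dev"),
                           m.group("old"), m.group("new")))
    return alerts


sample = """
Code: 0x00000051 .. Event Description: State change on VD 00/1 from
OPTIMAL(3) to DEGRADED(2)
Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0)
from ONLINE(18) to FAILED(11)
"""

for alert in find_alerts(sample):
    print(alert)
```

The result could then be handed to syslog or mail by a periodic job; how
the event text gets out of the controller is still the open question.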


From owner-freebsd-scsi@freebsd.org  Sun Feb 14 15:26:26 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6256FAA8670;
 Sun, 14 Feb 2016 15:26:26 +0000 (UTC) (envelope-from lists@opsec.eu)
Received: from home.opsec.eu (home.opsec.eu [IPv6:2001:14f8:200::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 2A32B172D;
 Sun, 14 Feb 2016 15:26:26 +0000 (UTC) (envelope-from lists@opsec.eu)
Received: from pi by home.opsec.eu with local (Exim 4.86 (FreeBSD))
 (envelope-from <lists@opsec.eu>)
 id 1aUyZ8-000BJ1-RF; Sun, 14 Feb 2016 16:26:26 +0100
Date: Sun, 14 Feb 2016 16:26:26 +0100
From: Kurt Jaeger <lists@opsec.eu>
To: Tinker <tinkr@openmailbox.org>
Cc: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org,
 freebsd-fs@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the logs?
Message-ID: <20160214152626.GH26283@home.opsec.eu>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
 <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Feb 2016 15:26:26 -0000

Hi!

> So my final question then is, how do you extract it into userland (in 
> the absence of an "mfiutil" as the MFI driver has)?

They renamed the util to StorCLI, it looks very similar to the old tw_cli,
and can be downloaded from

http://www.avagotech.com/products/server-storage/raid-controllers/megaraid-sas-9266-8i#downloads

as MR_SAS_StorCLI_1-16-06.zip, unpacking it yields storcli_all_os.zip,
unpacking that yields storcli_all_os/FreeBSD/storcli64.tar,
and finally unpacking that gives

$ file storcli64
storcli64: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, for FreeBSD 7.4, stripped

which at least looks like it might work with the MRSAS controller.
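
The unpacking steps above could be scripted roughly as follows. This is
a sketch only: it assumes the nesting is exactly as described (zip
containing storcli_all_os.zip, containing FreeBSD/storcli64.tar,
containing storcli64), and the function and helper names are made up
for illustration.

```python
import io
import os
import tarfile
import zipfile


def _find(names, suffix):
    """Return the first archive member whose name ends with suffix."""
    for name in names:
        if name.endswith(suffix):
            return name
    raise FileNotFoundError(suffix)


def extract_storcli(outer_zip_path, dest_dir="."):
    """Pull storcli64 out of the zip -> zip -> tar nesting described above."""
    with zipfile.ZipFile(outer_zip_path) as outer:
        inner_bytes = outer.read(_find(outer.namelist(), "storcli_all_os.zip"))
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        tar_bytes = inner.read(_find(inner.namelist(), "FreeBSD/storcli64.tar"))
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar.getmembers():
            # Accept the binary whether or not it sits in a subdirectory.
            if member.isfile() and member.name.split("/")[-1] == "storcli64":
                data = tar.extractfile(member).read()
                break
        else:
            raise FileNotFoundError("storcli64")
    out_path = os.path.join(dest_dir, "storcli64")
    with open(out_path, "wb") as out:
        out.write(data)
    os.chmod(out_path, 0o755)  # make the extracted binary executable
    return out_path
```

Calling extract_storcli("MR_SAS_StorCLI_1-16-06.zip") would leave a
storcli64 binary in the current directory, if the layout matches.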

-- 
pi@opsec.eu            +49 171 3101372                         4 years to go !

From owner-freebsd-scsi@freebsd.org  Tue Feb 16 05:39:01 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 772A8AA9249
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Tue, 16 Feb 2016 05:39:01 +0000 (UTC)
 (envelope-from kashyap.desai@broadcom.com)
Received: from mail-lf0-x235.google.com (mail-lf0-x235.google.com
 [IPv6:2a00:1450:4010:c07::235])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 0D1761FF4
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 05:39:01 +0000 (UTC)
 (envelope-from kashyap.desai@broadcom.com)
Received: by mail-lf0-x235.google.com with SMTP id m1so101909544lfg.0
 for <freebsd-scsi@freebsd.org>; Mon, 15 Feb 2016 21:39:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google;
 h=from:references:in-reply-to:mime-version:thread-index:date
 :message-id:subject:to:content-type;
 bh=iAJawgYiieoFdQcH/VO6qvLuEmpo+2cYfhtwttxS5e8=;
 b=RAl83NzqMLv+zCrIQexxS450erFlVJjc8/84t+2XX5/oo44H5zBS6Y6G/TbnyvwuhH
 52CrmoIxyz/HoBJn0pU1d/NRIuIWXBfMvtyscUTKSE8UYaOwTZIfAPxa0RFdAlHchw0k
 z+cSVtEB0QoQNpkskTeqMDk9eBwGBEBTmA+PA=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:from:references:in-reply-to:mime-version
 :thread-index:date:message-id:subject:to:content-type;
 bh=iAJawgYiieoFdQcH/VO6qvLuEmpo+2cYfhtwttxS5e8=;
 b=HEvllp5sTSVsA1dq8fdRhrs97Oec5cGpBZUBb9Ydfc8DToJk91WUqtdXtYgFy33D9K
 Xm2s47nPSoBtwRJFoRrSw7hLZYmhMIREyzj8ArnrR2hcdTSj/PbKH5rCb2Jp6vKYsVF3
 YeqwnarUwr4uGqmbskcdIIkVlLS2sOOj1eaIV+QqZhqhbBZg5lEwymAgPWGogkSXqWH0
 93ME/X8W3VMloAyl0POce02rmalFNfDRWJ5PblUaGsL3MMoO1+tK5BNvaxpRRT3MxrtU
 T5SyE9R2gkUs1z3AJd8aRwMqZTEqSWEX2kd94ceBg9KGJjDvFsPMwOgh9Rob1mGmF86x
 bmug==
X-Gm-Message-State: AG10YOSUHhEpcv0VHUnMiMiksZ/lbCiEig6hwAbunZjoB2uw0yTnWaxRWZcWcGb/YzghAKQu0Nk56USlBMHFxKZU
X-Received: by 10.25.31.193 with SMTP id f184mr8775837lff.5.1455601138393;
 Mon, 15 Feb 2016 21:38:58 -0800 (PST)
From: Kashyap Desai <kashyap.desai@broadcom.com>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
 <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
MIME-Version: 1.0
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AQKkl8+O2HXYvg9L5h3WxDO8ryLRKQD7/aYNnX/od8A=
Date: Tue, 16 Feb 2016 11:08:57 +0530
Message-ID: <76cfa84fa2600ca7022cfd9635d06245@mail.gmail.com>
Subject: RE: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the logs?
To: Tinker <tinkr@openmailbox.org>, freebsd-scsi@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Feb 2016 05:39:01 -0000

Keeping only freebsd-scsi mailing list

> -----Original Message-----
> From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-
> scsi@freebsd.org] On Behalf Of Tinker
> Sent: Sunday, February 14, 2016 8:44 PM
> To: freebsd-stable@freebsd.org; freebsd-scsi@freebsd.org; freebsd-
> fs@freebsd.org
> Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When
> one of the Raid's physical drives break, how is it reported in the logs?
>
> (Will send any followup from now only to freebsd-scsi@ .)
>
>
>
> Did some additional research and found that the disk failure indeed is
> reported in MRSAS' "event log".
>
> So my final question then is, how do you extract it into userland (in the
> absence of an "mfiutil" as the MFI driver has)?

Are you using the <mrsas> driver from the Avago external portal or the
in-box driver in the FreeBSD kernel?
The MRSAS driver has an associated application to retrieve such events in
user space. Can you please post your query to the Avago/Broadcom support
team?

>
>
>
> Details below. Thanks.
>
> On 2016-02-14 19:59, Tinker wrote:
> [...]
> > http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
> > on page 305, that is section "A.2 Event Messages" - I don't know for
> > what LSI chip this document is, but, it does not list particular event
> > message very clearly for when an individual underlying disk would have
> > broken, I don't even see any event for when a hot spare would be taken
> > in use!
>
>
> Wait - this page:
>
> https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array
>
> (and also
> http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it
> )
>
> gives an example of how the host system learns about broken disks:
>
>
> Code: 0x00000051 .. Event Description: State change on VD 00/1 from
> OPTIMAL(3) to DEGRADED(2)
>
>
> Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0)
> from ONLINE(18) to FAILED(11)
>
> (unclean disk broken seems to be shown as:)
>
> Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0)
> Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense:
> b/00/00
>
>
> And this version of the LSI documentation
>
> http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf
>
> gives a clearer definition of the physical and virtual drive states in
> "1.4.16 Physical Drive States"
> and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.
>
> So as we see, a physical drive breaking would
>
>   * "FAILED" the physical drive
>
>   * "DEGRADED" the Virtual Drive (that is the logical exported drive)
> (from "OPTIMAL")
>
>
> So then, it was indeed the card's "event log" that contains this info.
>
>
>
> Last question then would only be then, *where* FreeBSD's MRSAS driver
> sends its event log?
>
>
>
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

From owner-freebsd-scsi@freebsd.org  Tue Feb 16 11:46:06 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7C17CAA91EA
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Tue, 16 Feb 2016 11:46:06 +0000 (UTC)
 (envelope-from david.ford@ouce.ox.ac.uk)
Received: from fallback2.mail.ox.ac.uk (fallback2.mail.ox.ac.uk [129.67.1.167])
 by mx1.freebsd.org (Postfix) with ESMTP id 4B1781A8C
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 11:46:05 +0000 (UTC)
 (envelope-from david.ford@ouce.ox.ac.uk)
Received: from relay12.mail.ox.ac.uk ([129.67.1.163])
 by fallback2.mail.ox.ac.uk with esmtp (Exim 4.80)
 (envelope-from <david.ford@ouce.ox.ac.uk>) id 1aVe4t-0002cc-88
 for freebsd-scsi@freebsd.org; Tue, 16 Feb 2016 11:45:59 +0000
Received: from hub06.nexus.ox.ac.uk ([163.1.154.240]
 helo=HUB06.ad.oak.ox.ac.uk)
 by relay12.mail.ox.ac.uk with esmtp (Exim 4.80)
 (envelope-from <david.ford@ouce.ox.ac.uk>) id 1aVe4h-00087g-fN
 for freebsd-scsi@freebsd.org; Tue, 16 Feb 2016 11:45:47 +0000
Received: from MBX01.ad.oak.ox.ac.uk ([169.254.1.95]) by HUB06.ad.oak.ox.ac.uk
 ([169.254.15.20]) with mapi id 14.03.0248.002;
 Tue, 16 Feb 2016 11:45:47 +0000
From: David Ford <david.ford@ouce.ox.ac.uk>
To: "'freebsd-scsi@freebsd.org'" <freebsd-scsi@freebsd.org>
Subject: camcontrol sata affiliations
Thread-Topic: camcontrol sata affiliations
Thread-Index: AdForTYiiodPtgJoTRCpk4BC+u40Tg==
Date: Tue, 16 Feb 2016 11:45:47 +0000
Message-ID: <D64F8B312592434E801895942B9101163F9C7A61@MBX01.ad.oak.ox.ac.uk>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [172.16.150.237]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Feb 2016 11:46:06 -0000

Hello,

I have a number of dual-homed SAS disk chassis, with a mixture of SAS and
SATA drives. As expected, the SAS drives appear to both hosts, and the SATA
drives appear on a single host, which gets the SAS affiliation.

>From the host with the SATA drive visible:

[root@backup-san1 ~]# camcontrol smpphylist /dev/ses0
26 PHYs:
PHY  Attached SAS Address
  0  0x0000000000000000
  1  0x0000000000000000
  2  0x50080e53c2b8f002   <ATA ST1000DM003-1ER1 CC45>       (da33,pass36)
  3  0x5000cca01ab1a139   <IBM-ESXS HUS723030ALS64 J210>    (pass0,da0)
  4  0x0000000000000000
  5  0x0000000000000000
  6  0x0000000000000000
  7  0x5000c50041affc01   <IBM-ESXS ST33000650SS BC36>      (pass2,da2)
  8  0x0000000000000000
  9  0x0000000000000000
 10  0x5000cca03ea41585   <IBM-ESXS HUS723030ALS64 J3K7>    (pass1,da1)
 11  0x0000000000000000
 12  0x500605b004f24f20
 13  0x500605b004f24f20
 14  0x500605b004f24f20
 15  0x500605b004f24f20
 16  0x0000000000000000
 17  0x0000000000000000
 18  0x0000000000000000
 19  0x0000000000000000
 20  0x0000000000000000
 21  0x0000000000000000
 22  0x0000000000000000
 23  0x0000000000000000
 24  0x50080e53c2b8f03d
 25  0x000000000000003e

>From the other host:

root@backup-san-02:~ # camcontrol smpphylist /dev/ses0
26 PHYs:
PHY  Attached SAS Address
  0  0x0000000000000000
  1  0x0000000000000000
  2  0x0000000000000000
  3  0x5000cca01ab1a13a   <IBM-ESXS HUS723030ALS64 J210>    (pass2,da1)
  4  0x0000000000000000
  5  0x0000000000000000
  6  0x0000000000000000
  7  0x5000c50041affc02   <IBM-ESXS ST33000650SS BC36>      (pass1,da0)
  8  0x0000000000000000
  9  0x0000000000000000
 10  0x5000cca03ea41586   <IBM-ESXS HUS723030ALS64 J3K7>    (pass3,da2)
 11  0x0000000000000000
 12  0x500605b004f27920
 13  0x500605b004f27920
 14  0x500605b004f27920
 15  0x500605b004f27920
 16  0x0000000000000000
 17  0x0000000000000000
 18  0x0000000000000000
 19  0x0000000000000000
 20  0x0000000000000000
 21  0x0000000000000000
 22  0x0000000000000000
 23  0x0000000000000000
 24  0x50080e53c1e1803d
 25  0x000000000000003e
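
The two listings above can also be compared mechanically. The following
is a minimal sketch, assuming the "smpphylist" output format is exactly
as shown and that the same PHY number on both expanders refers to the
same slot; the function names are made up for illustration.

```python
import re

# One row of "camcontrol smpphylist" output as shown above:
# PHY number, attached SAS address, optional <inquiry> and (periph) parts.
PHY_LINE = re.compile(
    r"^\s*(?P<phy>\d+)\s+(?P<addr>0x[0-9a-fA-F]+)"
    r"(?:\s+<(?P<inquiry>[^>]*)>)?(?:\s+\((?P<periphs>[^)]*)\))?\s*$"
)


def parse_phylist(text):
    """Map PHY number -> (attached SAS address, inquiry string or None)."""
    phys = {}
    for line in text.splitlines():
        m = PHY_LINE.match(line)
        if m:  # header lines such as "26 PHYs:" do not match and are skipped
            phys[int(m.group("phy"))] = (int(m.group("addr"), 16),
                                         m.group("inquiry"))
    return phys


def only_on_first(first, second):
    """PHYs with a device attached on the first host but not on the second."""
    return sorted(phy for phy, (addr, _) in first.items()
                  if addr != 0 and second.get(phy, (0, None))[0] == 0)


host_a = """26 PHYs:
PHY  Attached SAS Address
  2  0x50080e53c2b8f002   <ATA ST1000DM003-1ER1 CC45>       (da33,pass36)
  3  0x5000cca01ab1a139   <IBM-ESXS HUS723030ALS64 J210>    (pass0,da0)
  4  0x0000000000000000
"""
host_b = """26 PHYs:
PHY  Attached SAS Address
  2  0x0000000000000000
  3  0x5000cca01ab1a13a   <IBM-ESXS HUS723030ALS64 J210>    (pass2,da1)
  4  0x0000000000000000
"""

print(only_on_first(parse_phylist(host_a), parse_phylist(host_b)))
```

With the abbreviated samples, only the PHY holding the SATA drive is
reported; the SAS drive is attached on both hosts, each seeing a SAS
address that differs in the last digit.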


I can successfully clear the affiliation:

[root@backup-san1 ~]# camcontrol  smppc /dev/ses0 -p 2 -o clearaffiliation
[root@backup-san1 ~]# smp_rep_phy_sata --phy=2 /dev/ses0
Report phy SATA response:
  expander change count: 74
  phy identifier: 2
  STP I_T nexus loss occurred: 0
  affiliations supported: 1
  affiliation valid: 0
  STP SAS address: 0x50080e53c2b8f002
  register device to host FIS:
    34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
  affiliated STP initiator SAS address: 0x0
  STP I_T nexus loss SAS address: 0x0
  affiliation context: 0
  current affiliation contexts: 0
  maximum affiliation contexts: 1

However from the other host:

root@backup-san-02:~ # camcontrol smppc /dev/ses0 -p 2 -o sataportsel

appears to do nothing: the output of camcontrol smpphylist /dev/ses0 is
unchanged, and the drive does not appear on a rescan or when I attempt to
hard reset it.

root@backup-san-02:~ # smp_rep_phy_sata --phy=2 /dev/ses0
Report phy SATA result: Phy does not support SATA

The systems are running FreeBSD 10.2, and I have tested with both the mps
and the mpr driver on different systems; the behaviour is identical.

Either I'm missing a crucial step in this process, or it's a bug. Does
anyone have any suggestions?

Thanks

David

-- 
David Ford
IT Manager, School of Geography and the Environment
For general IT Support queries please contact itsupport@ouce.ox.ac.uk
Telephone: +44 1865 285089


From owner-freebsd-scsi@freebsd.org  Tue Feb 16 15:23:30 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93D75AAA422
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Tue, 16 Feb 2016 15:23:30 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: from mail-ob0-x22c.google.com (mail-ob0-x22c.google.com
 [IPv6:2607:f8b0:4003:c01::22c])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 5706D188D
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 15:23:30 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: by mail-ob0-x22c.google.com with SMTP id wb13so263793034obb.1
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 07:23:30 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=/KbkwsW93N9Vv2aTlYbayvbFZQfEojMb07+movbXb8I=;
 b=BNhG56HsC3NsPrv3b4VgUDld+CY1yZEYvSHI6NsOL+5MHgxqqalZ+1nwpVRm67Uy4p
 ryL4lG794j+X0b7oUcDRNJOobKk6AK5pgZ56xriY5lJ3CVpfXE9MpMJ7kExlcUmLLDRD
 t2MQSlESdsO1tzmtIMQg+vG2RoQjQsHjj4BDPLbFRE99wzmUUrKnQbDplQDH2+4lpbXr
 dhajGXOkn+GNLRQzsgoeXeDbXLJtVeLkMDIUbzMxjA8nBA5gOdYfAkzPLbeYsv9nhiPi
 8Y4XC979xTtE2XrgcO0j8Txc/UCbJzp9oUXQKYVpJFNDJ79n26Ex6n0cF6GDlqSWhbOQ
 KW+Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:sender:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=/KbkwsW93N9Vv2aTlYbayvbFZQfEojMb07+movbXb8I=;
 b=mRHW+SJeb4IS7KFnyBg7JENEHFg/ZQ14ksftrUy1Q58FsDPSZYUxVzbm4LQJgBA4g9
 jAGpM07xeZiG2KWRpKnFMvepZ1mEjbGhJnfWw0KVzkUHVU2nyHZ6f8YDt2DOh7ZKdN49
 epk+cPbwO0d/bz8/KEKishqOolNcS2BR0Um8xumTXvkCXAcgI7SuhyGLpRDKZOc81tRH
 Jcf2wG/HlqGh2qPYkArbSXK68be88auRj3vevvm8PF5AzQTfxgZuYDssQZFXydRweobw
 n5yzCJGWZdGTaRT79Hr2q7iRcFsi/r3kqFxL4L31yAAxetpRPkX0/A/b8BM3GVNwkYuU
 9q4Q==
X-Gm-Message-State: AG10YOQobKCmHBwb5GdXGGh5EnZfUvfZsYOrgcJfxOcTET6L47xGbrgowWAzCJiCGHaplhaEd8gbzCK7ycZPZw==
MIME-Version: 1.0
X-Received: by 10.60.127.166 with SMTP id nh6mr17182885oeb.64.1455636181657;
 Tue, 16 Feb 2016 07:23:01 -0800 (PST)
Sender: asomers@gmail.com
Received: by 10.202.78.83 with HTTP; Tue, 16 Feb 2016 07:23:01 -0800 (PST)
In-Reply-To: <D64F8B312592434E801895942B9101163F9C7A61@MBX01.ad.oak.ox.ac.uk>
References: <D64F8B312592434E801895942B9101163F9C7A61@MBX01.ad.oak.ox.ac.uk>
Date: Tue, 16 Feb 2016 08:23:01 -0700
X-Google-Sender-Auth: c7paMePKHQPTOPfyDQmdN2wwtB4
Message-ID: <CAOtMX2jvVLUFHcpOs1r=Wcr_G3rKFStDQ92RatSsq-VAM9FDmQ@mail.gmail.com>
Subject: Re: camcontrol sata affiliations
From: Alan Somers <asomers@freebsd.org>
To: David Ford <david.ford@ouce.ox.ac.uk>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Feb 2016 15:23:30 -0000

On Tue, Feb 16, 2016 at 4:45 AM, David Ford <david.ford@ouce.ox.ac.uk> wrote:
> Hello,
>
> I have a number of dual homed SAS disk chasses, with a mixture of SAS and SATA drives. As expected, the SAS drives appear to both hosts, and the SATA drives appear on a single host, which gets the SAS affiliation.
>
> From the host with the SATA drive visible:
>
> [root@backup-san1 ~]# camcontrol smpphylist /dev/ses0
> 26 PHYs:
> PHY  Attached SAS Address
>   0  0x0000000000000000
>   1  0x0000000000000000
>   2  0x50080e53c2b8f002   <ATA ST1000DM003-1ER1 CC45>       (da33,pass36)
>   3  0x5000cca01ab1a139   <IBM-ESXS HUS723030ALS64 J210>    (pass0,da0)
>   4  0x0000000000000000
>   5  0x0000000000000000
>   6  0x0000000000000000
>   7  0x5000c50041affc01   <IBM-ESXS ST33000650SS BC36>      (pass2,da2)
>   8  0x0000000000000000
>   9  0x0000000000000000
>  10  0x5000cca03ea41585   <IBM-ESXS HUS723030ALS64 J3K7>    (pass1,da1)
>  11  0x0000000000000000
>  12  0x500605b004f24f20
>  13  0x500605b004f24f20
>  14  0x500605b004f24f20
>  15  0x500605b004f24f20
>  16  0x0000000000000000
>  17  0x0000000000000000
>  18  0x0000000000000000
>  19  0x0000000000000000
>  20  0x0000000000000000
>  21  0x0000000000000000
>  22  0x0000000000000000
>  23  0x0000000000000000
>  24  0x50080e53c2b8f03d
>  25  0x000000000000003e
>
> From the other host:
>
> root@backup-san-02:~ # camcontrol smpphylist /dev/ses0
> 26 PHYs:
> PHY  Attached SAS Address
>   0  0x0000000000000000
>   1  0x0000000000000000
>   2  0x0000000000000000
>   3  0x5000cca01ab1a13a   <IBM-ESXS HUS723030ALS64 J210>    (pass2,da1)
>   4  0x0000000000000000
>   5  0x0000000000000000
>   6  0x0000000000000000
>   7  0x5000c50041affc02   <IBM-ESXS ST33000650SS BC36>      (pass1,da0)
>   8  0x0000000000000000
>   9  0x0000000000000000
>  10  0x5000cca03ea41586   <IBM-ESXS HUS723030ALS64 J3K7>    (pass3,da2)
>  11  0x0000000000000000
>  12  0x500605b004f27920
>  13  0x500605b004f27920
>  14  0x500605b004f27920
>  15  0x500605b004f27920
>  16  0x0000000000000000
>  17  0x0000000000000000
>  18  0x0000000000000000
>  19  0x0000000000000000
>  20  0x0000000000000000
>  21  0x0000000000000000
>  22  0x0000000000000000
>  23  0x0000000000000000
>  24  0x50080e53c1e1803d
>  25  0x000000000000003e
>
>
> I can successfully clear the affiliation:
>
> [root@backup-san1 ~]# camcontrol  smppc /dev/ses0 -p 2 -o clearaffiliation
> [root@backup-san1 ~]# smp_rep_phy_sata --phy=2 /dev/ses0
> Report phy SATA response:
>   expander change count: 74
>   phy identifier: 2
>   STP I_T nexus loss occurred: 0
>   affiliations supported: 1
>   affiliation valid: 0
>   STP SAS address: 0x50080e53c2b8f002
>   register device to host FIS:
>     34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
>   affiliated STP initiator SAS address: 0x0
>   STP I_T nexus loss SAS address: 0x0
>   affiliation context: 0
>   current affiliation contexts: 0
>   maximum affiliation contexts: 1
>
> However from the other host:
>
> root@backup-san-02:~ # camcontrol smppc /dev/ses0 -p 2 -o sataportsel
>
> appears to do nothing: the output of camcontrol smpphylist /dev/ses0 is unchanged, and the drive does not appear on a rescan or when I attempt to hard reset it.
>
> root@backup-san-02:~ # smp_rep_phy_sata --phy=2 /dev/ses0
> Report phy SATA result: Phy does not support SATA
>
> The systems are running FreeBSD 10.2, and I have tested with both the mps and the mpr driver on different systems; the behaviour is identical.
>
> Either I'm missing a crucial step in this process, or it's a bug. Does anyone have any suggestions?
>
> Thanks
>
> David
>

You aren't missing anything.  This is just a difference between SATA
and SAS.  SAS drives have two ports, and SATA drives have only one.
Most (all?) multipath JBODs like yours have two separate expander
chips.  They connect every slot's first port to the first expander and
every slot's second port to the second expander.  That results in a
chassis with no single point of failure (SPOF).  With such hardware,
there's no way to connect a SATA drive to both servers.  And with more
complicated hardware that uses a single expander chip combined with SAS
zoning to connect a SATA drive to two servers, you're stuck with a SPOF.
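
To make the point concrete, here is a rough sketch (not part of the
original exchange) that compares two "camcontrol smpphylist" listings
and reports which PHYs show a drive on both hosts and which on only
one.  The sample text is an excerpt of the two listings quoted above;
the function names and the labels "both hosts" / "one host only" are
made up for illustration.

```python
import re

# One PHY row: number, 64-bit SAS address, optional <vendor model rev>.
PHY_LINE = re.compile(r"^\s*(\d+)\s+0x([0-9a-fA-F]{16})(?:\s+<([^>]*)>)?")

# Excerpts of the listings quoted earlier in this thread.
HOST_A = """
  2  0x50080e53c2b8f002   <ATA ST1000DM003-1ER1 CC45>       (da33,pass36)
  3  0x5000cca01ab1a139   <IBM-ESXS HUS723030ALS64 J210>    (pass0,da0)
  7  0x5000c50041affc01   <IBM-ESXS ST33000650SS BC36>      (pass2,da2)
 10  0x5000cca03ea41585   <IBM-ESXS HUS723030ALS64 J3K7>    (pass1,da1)
 12  0x500605b004f24f20
"""

HOST_B = """
  2  0x0000000000000000
  3  0x5000cca01ab1a13a   <IBM-ESXS HUS723030ALS64 J210>    (pass2,da1)
  7  0x5000c50041affc02   <IBM-ESXS ST33000650SS BC36>      (pass1,da0)
 10  0x5000cca03ea41586   <IBM-ESXS HUS723030ALS64 J3K7>    (pass3,da2)
 12  0x500605b004f27920
"""


def parse_phylist(text):
    """Map PHY number -> (SAS address as int, device description or None)."""
    phys = {}
    for line in text.splitlines():
        line = line.lstrip("> ")  # tolerate mail-quoted input
        m = PHY_LINE.match(line)
        if m:
            phys[int(m.group(1))] = (int(m.group(2), 16), m.group(3))
    return phys


def compare(host_a, host_b):
    """Label every PHY that reports a drive on at least one host."""
    report = {}
    for phy in sorted(set(host_a) | set(host_b)):
        _, desc_a = host_a.get(phy, (0, None))
        _, desc_b = host_b.get(phy, (0, None))
        if desc_a is None and desc_b is None:
            continue  # empty slot, or a link with no device description
        if desc_a and desc_b:
            report[phy] = "both hosts"
        else:
            report[phy] = "one host only"
    return report


if __name__ == "__main__":
    result = compare(parse_phylist(HOST_A), parse_phylist(HOST_B))
    for phy, label in result.items():
        print(phy, label)
```

On the excerpt above, PHY 2 (the SATA drive) is labelled "one host
only", while PHYs 3, 7 and 10 (the SAS drives) are labelled "both
hosts".  Note that each SAS drive also reports a different address on
each host, which fits the two-ports-per-drive picture.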

-Alan

From owner-freebsd-scsi@freebsd.org  Tue Feb 16 15:32:41 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 37BE1AAA80D
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Tue, 16 Feb 2016 15:32:41 +0000 (UTC)
 (envelope-from david.ford@ouce.ox.ac.uk)
Received: from relay13.mail.ox.ac.uk (relay13.mail.ox.ac.uk [129.67.1.166])
 by mx1.freebsd.org (Postfix) with ESMTP id 07B821FEC;
 Tue, 16 Feb 2016 15:32:40 +0000 (UTC)
 (envelope-from david.ford@ouce.ox.ac.uk)
Received: from hub05.nexus.ox.ac.uk ([163.1.154.231]
 helo=HUB05.ad.oak.ox.ac.uk)
 by relay13.mail.ox.ac.uk with esmtp (Exim 4.80)
 (envelope-from <david.ford@ouce.ox.ac.uk>)
 id 1aVhc7-0005oy-gR; Tue, 16 Feb 2016 15:32:31 +0000
Received: from MBX01.ad.oak.ox.ac.uk ([169.254.1.95]) by HUB05.ad.oak.ox.ac.uk
 ([163.1.154.96]) with mapi id 14.03.0248.002;
 Tue, 16 Feb 2016 15:32:30 +0000
From: David Ford <david.ford@ouce.ox.ac.uk>
To: 'Alan Somers' <asomers@freebsd.org>
CC: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject: RE: camcontrol sata affiliations
Thread-Topic: camcontrol sata affiliations
Thread-Index: AdForTYiiodPtgJoTRCpk4BC+u40TgAILYGAAAAXiDA=
Date: Tue, 16 Feb 2016 15:32:30 +0000
Message-ID: <D64F8B312592434E801895942B9101163F9C7DFF@MBX01.ad.oak.ox.ac.uk>
References: <D64F8B312592434E801895942B9101163F9C7A61@MBX01.ad.oak.ox.ac.uk>
 <CAOtMX2jvVLUFHcpOs1r=Wcr_G3rKFStDQ92RatSsq-VAM9FDmQ@mail.gmail.com>
In-Reply-To: <CAOtMX2jvVLUFHcpOs1r=Wcr_G3rKFStDQ92RatSsq-VAM9FDmQ@mail.gmail.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [172.16.150.237]
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Feb 2016 15:32:41 -0000

> -----Original Message-----
> From: asomers@gmail.com [mailto:asomers@gmail.com] On Behalf Of Alan Somers
> Sent: 16 February 2016 15:23
> To: David Ford <david.ford@ouce.ox.ac.uk>
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: camcontrol sata affiliations
>
> You aren't missing anything.  This is just a difference between SATA
> and SAS.  SAS drives have two ports, and SATA drives have only one.
> Most (all?) multipath JBODs like yours have two separate expander
> chips.  They connect every slot's first port to the first expander and
> every slot's second port to the second expander.  That results in a
> chassis with no SPOF.  With such hardware, there's no way to connect a
> SATA drive to both servers.  And with more complicated hardware that
> uses a single expander chip combined with SAS zoning to connect a SATA
> drive to two servers, you're stuck with a SPOF.

Thanks that at least clarifies what's going on. I had understood most of that,
however I thought that the idea of the ability to clear the affiliation
was to permit the second expander to take over. I suspect I was mistaken.

Thanks

David


From owner-freebsd-scsi@freebsd.org  Tue Feb 16 15:55:43 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 287E9AA91FB
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Tue, 16 Feb 2016 15:55:43 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: from mail-ob0-x22b.google.com (mail-ob0-x22b.google.com
 [IPv6:2607:f8b0:4003:c01::22b])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id DF3CE1C68
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 15:55:42 +0000 (UTC)
 (envelope-from asomers@gmail.com)
Received: by mail-ob0-x22b.google.com with SMTP id gc3so163254183obb.3
 for <freebsd-scsi@freebsd.org>; Tue, 16 Feb 2016 07:55:42 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=uMJLk5Ncux4AGarFShHIEtX3DpOLhbW+7kabYd/uhMI=;
 b=ysUgpycxlp1U1BG3pGAoaqdBnmnJ1mNbXI/W9J8U+ptvrJy5M3hF5ssZFOhrwXnxxv
 hZ2NLEdumOjq2eyMEUeRS9FUgJZa1ofh1ixlMMudhGINRNyf2Xb4sEbWEvbYK27Ur6qD
 W86ojseC/fEqzCsUSoTKmjE7PRK00RQirNC42bVok27lHf8PFWI3YzKE5jYtU2eGic9f
 vUT2e7i4s48kMX39Zch+m3FL+BXqkXYt4JKJrAd/mtaP/vZkPN81fO5+EKH680RSvraR
 C0QybLOdVQ8Nt0Lu3SdYO22i/3UgwdOBEreNaEeG3mbskvcae2Dp25d5FBiVYafNaYIJ
 SjWQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:sender:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=uMJLk5Ncux4AGarFShHIEtX3DpOLhbW+7kabYd/uhMI=;
 b=Mt9YsDRHj/TIG5W3VtMx5vtIMqD1gZla2n/tvm98krhAAIOlF426Lo/FhRwSrDuAnK
 atllWGPqGt6latdmIAh70UjsZI1uWkeinn3D7t6UTHH4fsam25cg5tHDmCk1YJKNT1bb
 LRNTvUGsib/uqqJFU3ZzfF1kh6+8/qDQKM42+MHI5v0nc4UmoPSlyJci5/4cshxfcDqd
 QidNBCwVi0d+1hqWeyCYtRtCyFd/Lcad8yY3IIXB311HoFMvckp8swShvoP7MKDAfQge
 VSDkhEYkxIKazoZ9dIUzxhXCXkzLMVPsJYfDlTZs0NK4jg/Y675hu+mkn20sXNbh5wjY
 s8jw==
X-Gm-Message-State: AG10YOTonZsnbvtJ5bGEgB8xtZmuPrg5vjnZsfLcxlqxjQeRPrvB7f/RqZToovONUFgix/BruVdvGRz7z3Dp1A==
MIME-Version: 1.0
X-Received: by 10.60.246.74 with SMTP id xu10mr17349068oec.31.1455638135303;
 Tue, 16 Feb 2016 07:55:35 -0800 (PST)
Sender: asomers@gmail.com
Received: by 10.202.78.83 with HTTP; Tue, 16 Feb 2016 07:55:35 -0800 (PST)
In-Reply-To: <D64F8B312592434E801895942B9101163F9C7DFF@MBX01.ad.oak.ox.ac.uk>
References: <D64F8B312592434E801895942B9101163F9C7A61@MBX01.ad.oak.ox.ac.uk>
 <CAOtMX2jvVLUFHcpOs1r=Wcr_G3rKFStDQ92RatSsq-VAM9FDmQ@mail.gmail.com>
 <D64F8B312592434E801895942B9101163F9C7DFF@MBX01.ad.oak.ox.ac.uk>
Date: Tue, 16 Feb 2016 08:55:35 -0700
X-Google-Sender-Auth: jk2mj6cN8vsT3_mehyyS69roVUU
Message-ID: <CAOtMX2hf_Od_Xdq6d8Hi70=gXdKMZPRusHbU22hxo=wwHNeWtg@mail.gmail.com>
Subject: Re: camcontrol sata affiliations
From: Alan Somers <asomers@freebsd.org>
To: David Ford <david.ford@ouce.ox.ac.uk>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Feb 2016 15:55:43 -0000

On Tue, Feb 16, 2016 at 8:32 AM, David Ford <david.ford@ouce.ox.ac.uk> wrote:
>> -----Original Message-----
>> From: asomers@gmail.com [mailto:asomers@gmail.com] On Behalf Of Alan Somers
>> Sent: 16 February 2016 15:23
>> To: David Ford <david.ford@ouce.ox.ac.uk>
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: camcontrol sata affiliations
>>
>> You aren't missing anything.  This is just a difference between SATA
>> and SAS.  SAS drives have two ports, and SATA drives have only one.
>> Most (all?) multipath JBODs like yours have two separate expander
>> chips.  They connect every slot's first port to the first expander and
>> every slot's second port to the second expander.  That results in a
>> chassis with no SPOF.  With such hardware, there's no way to connect a
>> SATA drive to both servers.  And with more complicated hardware that
>> uses a single expander chip combined with SAS zoning to connect a SATA
>> drive to two servers, you're stuck with a SPOF.
>
> Thanks that at least clarifies what's going on. I had understood most of that,
> however I thought that the idea of the ability to clear the affiliation
> was to permit the second expander to take over. I suspect I was mistaken.
>
> Thanks
>
> David
>

Only if you have a JBOD that uses a single expander chip for both
hosts.  Then I think those commands would do what you want.

From owner-freebsd-scsi@freebsd.org  Wed Feb 17 00:01:11 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 058E5AAB088
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Wed, 17 Feb 2016 00:01:11 +0000 (UTC)
 (envelope-from ambrisko@ambrisko.com)
Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90])
 by mx1.freebsd.org (Postfix) with ESMTP id E4A622DA
 for <freebsd-scsi@freebsd.org>; Wed, 17 Feb 2016 00:01:10 +0000 (UTC)
 (envelope-from ambrisko@ambrisko.com)
X-Ambrisko-Me: Yes
Received: from server2.ambrisko.com (HELO internal.ambrisko.com)
 ([192.168.1.2])
 by ironport.ambrisko.com with ESMTP; 16 Feb 2016 16:14:48 -0800
Received: from ambrisko.com (localhost [127.0.0.1])
 by internal.ambrisko.com (8.14.9/8.14.4) with ESMTP id u1H002VC085902;
 Tue, 16 Feb 2016 16:00:02 -0800 (PST)
 (envelope-from ambrisko@ambrisko.com)
Received: (from ambrisko@localhost)
 by ambrisko.com (8.14.9/8.14.4/Submit) id u1H002Bs085890;
 Tue, 16 Feb 2016 16:00:02 -0800 (PST) (envelope-from ambrisko)
Date: Tue, 16 Feb 2016 16:00:02 -0800
From: Doug Ambrisko <ambrisko@ambrisko.com>
To: Tinker <tinkr@openmailbox.org>
Cc: freebsd-scsi@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the logs?
Message-ID: <20160217000002.GA81916@ambrisko.com>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
 <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Feb 2016 00:01:11 -0000

On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
| (Will send any followup from now only to freebsd-scsi@ .)
| 
| Did some additional research and found that the disk failure indeed is 
| reported in MRSAS' "event log".
| 
| So my final question then is, how do you extract it into userland (in 
| the absence of an "mfiutil" as the MFI driver has)?

I have local changes that print the event log to dmesg, which then gets
syslogged.  We watch syslog for issues so that we can report them to our
customers automatically.  This is similar to what mfi(4) does.
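
For illustration only (this is not the local patch mentioned above,
and the exact text that patch writes to dmesg is not shown in this
thread): a minimal sketch of a syslog watcher.  It assumes each event
description lands on a single log line in the form quoted further down
in this message ("State change on PD ... from ONLINE(18) to
FAILED(11)").  The function names and the set of states treated as
alerts are hypothetical.

```python
import re

# Matches e.g. "State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)"
STATE_CHANGE = re.compile(
    r"State change on (PD|VD) (\S+) from (\w+)\((\d+)\) to (\w+)\((\d+)\)"
)

# Assumption: these target states are the ones worth alerting on.
ALERT_STATES = {"FAILED", "DEGRADED"}


def find_state_changes(lines):
    """Return (kind, identifier, old_state, new_state) for each state change."""
    events = []
    for line in lines:
        m = STATE_CHANGE.search(line)
        if m:
            kind, ident, old, _old_code, new, _new_code = m.groups()
            events.append((kind, ident, old, new))
    return events


def alerts(lines):
    """Keep only the state changes whose new state is in ALERT_STATES."""
    return [e for e in find_state_changes(lines) if e[3] in ALERT_STATES]


# Event descriptions taken from the text quoted below, joined onto one line.
SAMPLE = [
    "Code: 0x00000051 .. Event Description: State change on VD 00/1 "
    "from OPTIMAL(3) to DEGRADED(2)",
    "Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) "
    "from ONLINE(18) to FAILED(11)",
    "Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) "
    "Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00",
]

if __name__ == "__main__":
    for kind, ident, old, new in alerts(SAMPLE):
        print(f"{kind} {ident}: {old} -> {new}")
```

In practice the lines would come from the syslog file rather than a
hard-coded list, and the pattern would need adjusting to whatever
prefix the driver and syslogd put in front of the description.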

Thanks,

Doug A.
| Details below. Thanks.
| 
| On 2016-02-14 19:59, Tinker wrote:
| [...]
| > http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
| > on page 305, that is section "A.2 Event Messages" - I don't know
| > which LSI chip this document is for, but it does not clearly list a
| > particular event message for when an individual underlying disk has
| > broken. I don't even see any event for when a hot spare is taken
| > into use!
| 
| 
| Wait - this page:
| 
| https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array
| 
| (and also 
| http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it 
| )
| 
| gives an example of how the host system learns about broken disks:
| 
| 
| Code: 0x00000051 .. Event Description: State change on VD 00/1 from 
| OPTIMAL(3) to DEGRADED(2)
| 
| 
| Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) 
| from ONLINE(18) to FAILED(11)
| 
| (an unclean disk failure seems to be shown as:)
| 
| Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) 
| Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: 
| b/00/00
| 
| 
| And this version of the LSI documentation
| 
| http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf
| 
| gives a clearer definition of the physical and virtual drive states in 
| "1.4.16 Physical Drive States"
| and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.
| 
| So as we see, a physical drive breaking would
| 
|   * "FAILED" the physical drive
| 
|   * "DEGRADED" the Virtual Drive (that is the logical exported drive) 
| (from "OPTIMAL")
| 
| 
| So then, it was indeed the card's "event log" that contains this info.
| 
| 
| 
| The last question, then, is only *where* FreeBSD's MRSAS driver
| sends its event log.

From owner-freebsd-scsi@freebsd.org  Wed Feb 17 08:15:06 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C604AAA976
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Wed, 17 Feb 2016 08:15:06 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: from smtp6.openmailbox.org (smtp6.openmailbox.org [62.4.1.40])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id CFB11E73
 for <freebsd-scsi@freebsd.org>; Wed, 17 Feb 2016 08:15:05 +0000 (UTC)
 (envelope-from tinkr@openmailbox.org)
Received: by mail2.openmailbox.org (Postfix, from userid 1004)
 id BCCAF2AC46FF; Wed, 17 Feb 2016 08:38:22 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org;
 s=openmailbox; t=1455694702;
 bh=sWHKZE5BOpoRV0FUlKivni4mKu7z1BW9pUiD5JGg5iw=;
 h=Date:From:To:Cc:Subject:In-Reply-To:References:From;
 b=Sezlo60fju+xp/mo2jPu+2WWGl/lpsZCnuKXh6g9eQBsMjyijX2SUZn1Qxg02bX/9
 qO64SV5mgoI3NjkrDnS7ag9kftjxpGaJXOFga0oTv/LoDKjGkEhVcOZRGD3LHwoZiE
 BHpMkQrHV4t6L4pjremfMEV3zbFflkJQkToN0IRQ=
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on openmailbox-b2
X-Spam-Level: 
X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50,
 DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0
Received: from www.openmailbox.org (openmailbox-b1 [10.91.69.218])
 by mail2.openmailbox.org (Postfix) with ESMTP id 8A70F2AC4B23;
 Wed, 17 Feb 2016 08:38:10 +0100 (CET)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Wed, 17 Feb 2016 14:38:10 +0700
From: Tinker <tinkr@openmailbox.org>
To: Doug Ambrisko <ambrisko@ambrisko.com>
Cc: freebsd-scsi@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the
 =?UTF-8?Q?logs=3F?=
In-Reply-To: <20160217000002.GA81916@ambrisko.com>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
 <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
 <20160217000002.GA81916@ambrisko.com>
Message-ID: <fceaf3867796102969153dea4a4cbbde@openmailbox.org>
X-Sender: tinkr@openmailbox.org
User-Agent: Roundcube Webmail/1.0.6
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Feb 2016 08:15:06 -0000

Hi Doug,

Would you mind sharing your kernel patch for that functionality (if I
understand you right, you patched your kernel to send the events to
dmesg)?

Thanks,
Tinker

On 2016-02-17 07:00, Doug Ambrisko wrote:
> On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
> | (Will send any followup from now only to freebsd-scsi@ .)
> |
> | Did some additional research and found that the disk failure
> | indeed is reported in MRSAS' "event log".
> |
> | So my final question then is, how do you extract it into userland (in
> | the absence of an "mfiutil" as the MFI driver has)?
>
> I have local changes to print the event log in dmesg which gets
> syslogged.
> We then watch syslog for issues to report things to our customers
> automatically.  This is similar to mfi(4).
> 
> Thanks,
> 
> Doug A.


From owner-freebsd-scsi@freebsd.org  Thu Feb 18 17:33:31 2016
Return-Path: <owner-freebsd-scsi@freebsd.org>
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7B5B5AAD5DA
 for <freebsd-scsi@mailman.ysv.freebsd.org>;
 Thu, 18 Feb 2016 17:33:31 +0000 (UTC)
 (envelope-from ambrisko@ambrisko.com)
Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90])
 by mx1.freebsd.org (Postfix) with ESMTP id 56EA2AEC
 for <freebsd-scsi@freebsd.org>; Thu, 18 Feb 2016 17:33:31 +0000 (UTC)
 (envelope-from ambrisko@ambrisko.com)
X-Ambrisko-Me: Yes
Received: from server2.ambrisko.com (HELO internal.ambrisko.com)
 ([192.168.1.2])
 by ironport.ambrisko.com with ESMTP; 18 Feb 2016 09:48:10 -0800
Received: from ambrisko.com (localhost [127.0.0.1])
 by internal.ambrisko.com (8.14.9/8.14.4) with ESMTP id u1IHXPsf029514;
 Thu, 18 Feb 2016 09:33:25 -0800 (PST)
 (envelope-from ambrisko@ambrisko.com)
Received: (from ambrisko@localhost)
 by ambrisko.com (8.14.9/8.14.4/Submit) id u1IHXPlB029513;
 Thu, 18 Feb 2016 09:33:25 -0800 (PST) (envelope-from ambrisko)
Date: Thu, 18 Feb 2016 09:33:25 -0800
From: Doug Ambrisko <ambrisko@ambrisko.com>
To: Tinker <tinkr@openmailbox.org>
Cc: freebsd-scsi@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of
 the Raid's physical drives break, how is it reported in the logs?
Message-ID: <20160218173325.GA29200@ambrisko.com>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
 <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
 <20160217000002.GA81916@ambrisko.com>
 <fceaf3867796102969153dea4a4cbbde@openmailbox.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <fceaf3867796102969153dea4a4cbbde@openmailbox.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi/>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Feb 2016 17:33:31 -0000

On Wed, Feb 17, 2016 at 02:38:10PM +0700, Tinker wrote:
| Hi Doug,
| 
| Would you mind sharing your kernel patch for that functionality (if I
| understand you right, you patched your kernel to send the events to
| dmesg)?

I need to do some work on mrsas stuff at work, so I plan to sync our
changes to -current etc.

I'll send them to you.

Doug A.

| On 2016-02-17 07:00, Doug Ambrisko wrote:
| > On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
| > | (Will send any followup from now only to freebsd-scsi@ .)
| > |
| > | Did some additional research and found that the disk failure
| > | indeed is reported in MRSAS' "event log".
| > |
| > | So my final question then is, how do you extract it into userland (in
| > | the absence of an "mfiutil" as the MFI driver has)?
| >
| > I have local changes to print the event log in dmesg which gets
| > syslogged.
| > We then watch syslog for issues to report things to our customers
| > automatically.  This is similar to mfi(4).
| > 
| > Thanks,
| > 
| > Doug A.