From owner-freebsd-scsi@freebsd.org Sun Feb 14 12:59:50 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7878FAA063B; Sun, 14 Feb 2016 12:59:50 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: from mail2.openmailbox.org (mail2.openmailbox.org [62.4.1.33]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3B4D6125A; Sun, 14 Feb 2016 12:59:49 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: by mail2.openmailbox.org (Postfix, from userid 1004) id B72812AC23D8; Sun, 14 Feb 2016 13:59:40 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org; s=openmailbox; t=1455454780; bh=svkyVb8OHK6GP+S1g1xbl/EC9JMRQzemiJAwu9H+ovU=; h=Date:From:To:Subject:From; b=UEIko6W7oaPLdp7a4SQDtKMWjvb17/05HsTeWmJR8Spg1keMqy558StagN4nSXg++ sFxbIOJt4V6kRncsQqSeDrhpvixzSkzp+hTM6qDfh9z0U8/qVKhbjTET3eeMgYPLyE eB163YufhMQpBYKTdtocYZF6i7x3hDhInbNjkt58= X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on openmailbox-b2 X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50, DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from www.openmailbox.org (openmailbox-b1 [10.91.69.218]) by mail2.openmailbox.org (Postfix) with ESMTP id 97C902AC3C0E; Sun, 14 Feb 2016 13:59:30 +0100 (CET) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Sun, 14 Feb 2016 19:59:30 +0700 From: Tinker To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org, freebsd-fs@freebsd.org Subject: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the =?UTF-8?Q?logs=3F?= Message-ID: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> X-Sender: 
tinkr@openmailbox.org User-Agent: Roundcube Webmail/1.0.6 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Feb 2016 12:59:50 -0000

( ** Extremely sorry for crossposting! It was unclear to me where this RAID adapter question belongs; please clarify and I'll keep to one single list. Posted to all of stable@, scsi@ and fs@ .)

Hi,

When you run one of the MRSAS cards, such as an Avagotech LSI MegaRaid 9361 or 9266, and eventually one of the physical RAID drives or a CacheCade drive breaks, how is this reported to the FreeBSD host's dmesg or syslog?

I don't have the hardware in place, so I can't check this myself. On the other hand, someone among you may have extremely deep experience, in particular because this card is so common, which is why I ask here.

I understand that if at least one underlying copy of the data is accessible, the RAID card will serve all access from that copy, so when it comes to keeping IO working without interruption, the LSI card does a great job.

At some point, an SSD or HDD will break down, either completely (it won't connect and its SMART interface says the drive is consumed) or more subtly, by taking a very long time for its operations.

My best understanding is that the RAID card will automatically and transparently take those drives out of use.

Now to the main point:

As admin, it's great to be informed when this happens, i.e. when an underlying physical RAID disk or a CacheCade disk is taken out of use or otherwise malfunctions.

Does the MRSAS driver output this into dmesg or syslog somehow?

Reading https://svnweb.freebsd.org/base/stable/10/sys/dev/mrsas/mrsas.c?revision=284267&view=markup , the card seems to have an "event log" that the driver downloads from the card in plaintext (??), but I don't understand from the source code where that information is channeled. And of course I also can't see what that event log would contain in those cases.

(The "mfiutil" has a "show events" argument, though mfiutil is only for the related "mfi" driver, which does not work for both 92XX and 93XX cards. In that case too, I'd still be interested to know how it reports a broken drive.)

http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf on page 305, that is section "A.2 Event Messages" - I don't know which LSI chip this document is for, but it does not clearly list a particular event message for when an individual underlying disk has broken. I don't even see any event for when a hot spare is taken into use!

Those of you who have the experience, can you clarify please?

Thanks :D
Tinker

From owner-freebsd-scsi@freebsd.org Sun Feb 14 15:13:48 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2143DAA7FDE; Sun, 14 Feb 2016 15:13:48 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: from mail2.openmailbox.org (mail2.openmailbox.org [62.4.1.33]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D13091E38; Sun, 14 Feb 2016 15:13:47 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: by mail2.openmailbox.org (Postfix, from userid 1004) id 78B892AC260D; Sun, 14 Feb 2016 16:13:43 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org; s=openmailbox; t=1455462823; bh=a6xRsHv3dB8Og6u7p4fjbM5qiUhvubkqMeI/6wnFUGk=; h=Date:From:To:Subject:In-Reply-To:References:From; b=BGg9woQZ2saaEnpPj7pRPuzJLQ/6mxc71q99ZNWGdj82+STcxUMZ0lO/68mXp6e8N /kP+z4YL/Pm2g5+z1B8kN41weu7n5aZMcEk2A4bRN0Rn8MwKFqNVcOOU8Ws5PkyJ6q Datw1/fbJ+OFpKv1M1qTy9TQ+/j3aXiYZrgUEU/s= X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
openmailbox-b2 X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50, DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from www.openmailbox.org (openmailbox-b2 [10.91.69.220]) by mail2.openmailbox.org (Postfix) with ESMTP id C48662AC564D; Sun, 14 Feb 2016 16:13:31 +0100 (CET) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Sun, 14 Feb 2016 22:13:31 +0700 From: Tinker To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org, freebsd-fs@freebsd.org Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the =?UTF-8?Q?logs=3F?= In-Reply-To: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> Message-ID: <55de137d1ed81930cfdbee579d881d62@openmailbox.org> X-Sender: tinkr@openmailbox.org User-Agent: Roundcube Webmail/1.0.6 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Feb 2016 15:13:48 -0000 (Will send any followup from now only to freebsd-scsi@ .) Did some additional research and found that the disk failure indeed is reported in MRSAS' "event log". So my final question then is, how do you extract it into userland (in the absence of an "mfiutil" as the MFI driver has)? Details below. Thanks. On 2016-02-14 19:59, Tinker wrote: [...] > http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf > on page 305, that is section "A.2 Event Messages" - I don't know for > what LGI chip this document is, but, it does not list particular event > message very clearly for when an individual underlying disk would have > broken, I don't even see any event for when a hot spare would be taken > in use! 
Wait - this page:

https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array

(and also http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it )

gives an example of how the host system learns about broken disks:

Code: 0x00000051 .. Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)

Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)

(a disk that breaks uncleanly seems to be shown as:)

Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00

And this version of the LSI documentation

http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf

gives a clearer definition of the physical and virtual drive states in "1.4.16 Physical Drive States" and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.

So as we see, a physical drive breaking would

* put the physical drive into the "FAILED" state

* put the Virtual Drive (that is, the logical exported drive) into the "DEGRADED" state (from "OPTIMAL")

So it is indeed the card's "event log" that contains this info.

The last question, then, is only *where* FreeBSD's MRSAS driver sends this event log.
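For anyone who ends up with the event log as text (from whichever tool turns out to expose it), here is a minimal Python sketch of how the state-change events quoted above could be picked out. The line layout is assumed only from the three sample lines in this message; the names `STATE_CHANGE`, `ALERT_STATES` and `find_alerts` are made up for illustration and are not part of any LSI/Avago tool or of the mrsas driver.

```python
import re

# Matches the "State change on ..." event text quoted in this thread.
# Assumption: real logs use the same layout as those sample lines.
STATE_CHANGE = re.compile(
    r"State change on (?P<kind>PD|VD) (?P<dev>\S+) "
    r"from (?P<old>[A-Z_ ]+)\((?P<old_code>\d+)\) "
    r"to (?P<new>[A-Z_ ]+)\((?P<new_code>\d+)\)"
)

# Target states worth alerting on (a hypothetical choice for this sketch).
ALERT_STATES = {"FAILED", "DEGRADED"}


def find_alerts(text):
    """Return (kind, device, old_state, new_state) for alert-worthy changes."""
    alerts = []
    for m in STATE_CHANGE.finditer(text):
        if m.group("new") in ALERT_STATES:
            alerts.append(
                (m.group("kind"), m.group("dev"), m.group("old"), m.group("new"))
            )
    return alerts


# The three sample events quoted above, one per line.
SAMPLE = """\
Code: 0x00000051 .. Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)
Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)
Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00
"""

if __name__ == "__main__":
    for kind, dev, old, new in find_alerts(SAMPLE):
        print(f"{kind} {dev}: {old} -> {new}")
```

The "Unexpected sense" line is deliberately not matched, since it reports an I/O error rather than a state change; whether it should also raise an alert is a policy decision left open here.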
From owner-freebsd-scsi@freebsd.org Sun Feb 14 15:26:26 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6256FAA8670; Sun, 14 Feb 2016 15:26:26 +0000 (UTC) (envelope-from lists@opsec.eu) Received: from home.opsec.eu (home.opsec.eu [IPv6:2001:14f8:200::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2A32B172D; Sun, 14 Feb 2016 15:26:26 +0000 (UTC) (envelope-from lists@opsec.eu) Received: from pi by home.opsec.eu with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1aUyZ8-000BJ1-RF; Sun, 14 Feb 2016 16:26:26 +0100 Date: Sun, 14 Feb 2016 16:26:26 +0100 From: Kurt Jaeger To: Tinker Cc: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org, freebsd-fs@freebsd.org Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs? Message-ID: <20160214152626.GH26283@home.opsec.eu> References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> <55de137d1ed81930cfdbee579d881d62@openmailbox.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org> X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Feb 2016 15:26:26 -0000 Hi! > So my final question then is, how do you extract it into userland (in > the absence of an "mfiutil" as the MFI driver has)? 
They renamed the util to StorCLI, it looks very similar to the old tw_cli, and can be downloaded from http://www.avagotech.com/products/server-storage/raid-controllers/megaraid-sas-9266-8i#downloads as MR_SAS_StorCLI_1-16-06.zip, unpacking it yields storcli_all_os.zip, unpacking that yields storcli_all_os/FreeBSD/storcli64.tar, and finally unpacking that gives $ file storcli64 storcli64: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, for FreeBSD 7.4, stripped which at least looks like it might work with the MRSAS controller. -- pi@opsec.eu +49 171 3101372 4 years to go ! From owner-freebsd-scsi@freebsd.org Tue Feb 16 05:39:01 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 772A8AA9249 for ; Tue, 16 Feb 2016 05:39:01 +0000 (UTC) (envelope-from kashyap.desai@broadcom.com) Received: from mail-lf0-x235.google.com (mail-lf0-x235.google.com [IPv6:2a00:1450:4010:c07::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0D1761FF4 for ; Tue, 16 Feb 2016 05:39:01 +0000 (UTC) (envelope-from kashyap.desai@broadcom.com) Received: by mail-lf0-x235.google.com with SMTP id m1so101909544lfg.0 for ; Mon, 15 Feb 2016 21:39:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=from:references:in-reply-to:mime-version:thread-index:date :message-id:subject:to:content-type; bh=iAJawgYiieoFdQcH/VO6qvLuEmpo+2cYfhtwttxS5e8=; b=RAl83NzqMLv+zCrIQexxS450erFlVJjc8/84t+2XX5/oo44H5zBS6Y6G/TbnyvwuhH 52CrmoIxyz/HoBJn0pU1d/NRIuIWXBfMvtyscUTKSE8UYaOwTZIfAPxa0RFdAlHchw0k z+cSVtEB0QoQNpkskTeqMDk9eBwGBEBTmA+PA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; 
h=x-gm-message-state:from:references:in-reply-to:mime-version :thread-index:date:message-id:subject:to:content-type; bh=iAJawgYiieoFdQcH/VO6qvLuEmpo+2cYfhtwttxS5e8=; b=HEvllp5sTSVsA1dq8fdRhrs97Oec5cGpBZUBb9Ydfc8DToJk91WUqtdXtYgFy33D9K Xm2s47nPSoBtwRJFoRrSw7hLZYmhMIREyzj8ArnrR2hcdTSj/PbKH5rCb2Jp6vKYsVF3 YeqwnarUwr4uGqmbskcdIIkVlLS2sOOj1eaIV+QqZhqhbBZg5lEwymAgPWGogkSXqWH0 93ME/X8W3VMloAyl0POce02rmalFNfDRWJ5PblUaGsL3MMoO1+tK5BNvaxpRRT3MxrtU T5SyE9R2gkUs1z3AJd8aRwMqZTEqSWEX2kd94ceBg9KGJjDvFsPMwOgh9Rob1mGmF86x bmug== X-Gm-Message-State: AG10YOSUHhEpcv0VHUnMiMiksZ/lbCiEig6hwAbunZjoB2uw0yTnWaxRWZcWcGb/YzghAKQu0Nk56USlBMHFxKZU X-Received: by 10.25.31.193 with SMTP id f184mr8775837lff.5.1455601138393; Mon, 15 Feb 2016 21:38:58 -0800 (PST) From: Kashyap Desai References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> <55de137d1ed81930cfdbee579d881d62@openmailbox.org> In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org> MIME-Version: 1.0 X-Mailer: Microsoft Outlook 14.0 Thread-Index: AQKkl8+O2HXYvg9L5h3WxDO8ryLRKQD7/aYNnX/od8A= Date: Tue, 16 Feb 2016 11:08:57 +0530 Message-ID: <76cfa84fa2600ca7022cfd9635d06245@mail.gmail.com> Subject: RE: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs? 
To: Tinker , freebsd-scsi@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Feb 2016 05:39:01 -0000

Keeping only the freebsd-scsi mailing list.

> -----Original Message-----
> From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-scsi@freebsd.org] On Behalf Of Tinker
> Sent: Sunday, February 14, 2016 8:44 PM
> To: freebsd-stable@freebsd.org; freebsd-scsi@freebsd.org; freebsd-fs@freebsd.org
> Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs?
>
> (Will send any followup from now only to freebsd-scsi@ .)
>
> Did some additional research and found that the disk failure indeed is reported in MRSAS' "event log".
>
> So my final question then is, how do you extract it into userland (in the absence of an "mfiutil" as the MFI driver has)?

Are you using the driver from the Avago external portal or the in-box FreeBSD kernel driver? The MRSAS driver has an associated application for reading such events from user space. Could you please post your query to the Avago/Broadcom support team?

> Details below. Thanks.
>
> On 2016-02-14 19:59, Tinker wrote:
> [...]
> > http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
> > on page 305, that is section "A.2 Event Messages" - I don't know for
> > what LGI chip this document is, but, it does not list particular event
> > message very clearly for when an individual underlying disk would have
> > broken, I don't even see any event for when a hot spare would be taken
> > in use!
>
> Wait - this page:
>
> https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array
>
> (and also http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it )
>
> gives an example of how the host system learns about broken disks:
>
> Code: 0x00000051 .. Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)
>
> Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)
>
> (unclean disk broken seems to be shown as:)
>
> Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0) Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00
>
> And this version of the LSI documentation
>
> http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf
>
> gives a clearer definition of the physical and virtual drive states in "1.4.16 Physical Drive States" and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.
>
> So as we see, a physical drive breaking would
>
> * "FAILED" the physical drive
>
> * "DEGRADED" the Virtual Drive (that is the logical exported drive) (from "OPTIMAL")
>
> So then, it was indeed the card's "event log" that contains this info.
>
> Last question then would only be then, *where* FreeBSD's MRSAS driver sends its event log?
> > > > _______________________________________________ > freebsd-scsi@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" From owner-freebsd-scsi@freebsd.org Tue Feb 16 11:46:06 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7C17CAA91EA for ; Tue, 16 Feb 2016 11:46:06 +0000 (UTC) (envelope-from david.ford@ouce.ox.ac.uk) Received: from fallback2.mail.ox.ac.uk (fallback2.mail.ox.ac.uk [129.67.1.167]) by mx1.freebsd.org (Postfix) with ESMTP id 4B1781A8C for ; Tue, 16 Feb 2016 11:46:05 +0000 (UTC) (envelope-from david.ford@ouce.ox.ac.uk) Received: from relay12.mail.ox.ac.uk ([129.67.1.163]) by fallback2.mail.ox.ac.uk with esmtp (Exim 4.80) (envelope-from ) id 1aVe4t-0002cc-88 for freebsd-scsi@freebsd.org; Tue, 16 Feb 2016 11:45:59 +0000 Received: from hub06.nexus.ox.ac.uk ([163.1.154.240] helo=HUB06.ad.oak.ox.ac.uk) by relay12.mail.ox.ac.uk with esmtp (Exim 4.80) (envelope-from ) id 1aVe4h-00087g-fN for freebsd-scsi@freebsd.org; Tue, 16 Feb 2016 11:45:47 +0000 Received: from MBX01.ad.oak.ox.ac.uk ([169.254.1.95]) by HUB06.ad.oak.ox.ac.uk ([169.254.15.20]) with mapi id 14.03.0248.002; Tue, 16 Feb 2016 11:45:47 +0000 From: David Ford To: "'freebsd-scsi@freebsd.org'" Subject: camcontrol sata affiliations Thread-Topic: camcontrol sata affiliations Thread-Index: AdForTYiiodPtgJoTRCpk4BC+u40Tg== Date: Tue, 16 Feb 2016 11:45:47 +0000 Message-ID: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.150.237] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Feb 2016 11:46:06 -0000

Hello,

I have a number of dual-homed SAS disk chassis, with a mixture of SAS and SATA drives. As expected, the SAS drives appear to both hosts, and the SATA drives appear on a single host, which gets the SAS affiliation.

From the host with the SATA drive visible:

[root@backup-san1 ~]# camcontrol smpphylist /dev/ses0
26 PHYs:
PHY  Attached SAS Address
  0  0x0000000000000000
  1  0x0000000000000000
  2  0x50080e53c2b8f002  (da33,pass36)
  3  0x5000cca01ab1a139  (pass0,da0)
  4  0x0000000000000000
  5  0x0000000000000000
  6  0x0000000000000000
  7  0x5000c50041affc01  (pass2,da2)
  8  0x0000000000000000
  9  0x0000000000000000
 10  0x5000cca03ea41585  (pass1,da1)
 11  0x0000000000000000
 12  0x500605b004f24f20
 13  0x500605b004f24f20
 14  0x500605b004f24f20
 15  0x500605b004f24f20
 16  0x0000000000000000
 17  0x0000000000000000
 18  0x0000000000000000
 19  0x0000000000000000
 20  0x0000000000000000
 21  0x0000000000000000
 22  0x0000000000000000
 23  0x0000000000000000
 24  0x50080e53c2b8f03d
 25  0x000000000000003e

From the other host:

root@backup-san-02:~ # camcontrol smpphylist /dev/ses0
26 PHYs:
PHY  Attached SAS Address
  0  0x0000000000000000
  1  0x0000000000000000
  2  0x0000000000000000
  3  0x5000cca01ab1a13a  (pass2,da1)
  4  0x0000000000000000
  5  0x0000000000000000
  6  0x0000000000000000
  7  0x5000c50041affc02  (pass1,da0)
  8  0x0000000000000000
  9  0x0000000000000000
 10  0x5000cca03ea41586  (pass3,da2)
 11  0x0000000000000000
 12  0x500605b004f27920
 13  0x500605b004f27920
 14  0x500605b004f27920
 15  0x500605b004f27920
 16  0x0000000000000000
 17  0x0000000000000000
 18  0x0000000000000000
 19  0x0000000000000000
 20  0x0000000000000000
 21  0x0000000000000000
 22  0x0000000000000000
 23  0x0000000000000000
 24  0x50080e53c1e1803d
 25  0x000000000000003e

I can successfully clear the affiliation:

[root@backup-san1 ~]# camcontrol smppc /dev/ses0 -p 2 -o clearaffiliation
[root@backup-san1 ~]# smp_rep_phy_sata --phy=2 /dev/ses0
Report phy SATA response:
  expander change count: 74
  phy identifier: 2
  STP I_T nexus loss occurred: 0
  affiliations supported: 1
  affiliation valid: 0
  STP SAS address: 0x50080e53c2b8f002
  register device to host FIS:
    34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
  affiliated STP initiator SAS address: 0x0
  STP I_T nexus loss SAS address: 0x0
  affiliation context: 0
  current affiliation contexts: 0
  maximum affiliation contexts: 1

However, from the other host:

root@backup-san-02:~ # camcontrol smppc /dev/ses0 -p 2 -o sataportsel

appears to do nothing - the output of camcontrol smpphylist /dev/ses0 is unchanged, and the drive does not appear on a rescan or if I attempt to hard reset it.

root@backup-san-02:~ # smp_rep_phy_sata --phy=2 /dev/ses0
Report phy SATA result: Phy does not support SATA

The systems are running FreeBSD 10.2, and I have tested with both the mps and the mpr driver on different systems; the behaviour is identical.

Either I'm missing a crucial step in this process, or it's a bug. Does anyone have any suggestions?

Thanks

David

--
David Ford
IT Manager, School of Geography and the Environment
For general IT Support queries please contact itsupport@ouce.ox.ac.uk
Telephone: +44 1865 285089

From owner-freebsd-scsi@freebsd.org Tue Feb 16 15:23:30 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93D75AAA422 for ; Tue, 16 Feb 2016 15:23:30 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-ob0-x22c.google.com (mail-ob0-x22c.google.com [IPv6:2607:f8b0:4003:c01::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5706D188D for ; Tue, 16 Feb 2016 15:23:30 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-ob0-x22c.google.com with SMTP id wb13so263793034obb.1 for ; Tue, 16 Feb 2016 07:23:30
-0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=/KbkwsW93N9Vv2aTlYbayvbFZQfEojMb07+movbXb8I=; b=BNhG56HsC3NsPrv3b4VgUDld+CY1yZEYvSHI6NsOL+5MHgxqqalZ+1nwpVRm67Uy4p ryL4lG794j+X0b7oUcDRNJOobKk6AK5pgZ56xriY5lJ3CVpfXE9MpMJ7kExlcUmLLDRD t2MQSlESdsO1tzmtIMQg+vG2RoQjQsHjj4BDPLbFRE99wzmUUrKnQbDplQDH2+4lpbXr dhajGXOkn+GNLRQzsgoeXeDbXLJtVeLkMDIUbzMxjA8nBA5gOdYfAkzPLbeYsv9nhiPi 8Y4XC979xTtE2XrgcO0j8Txc/UCbJzp9oUXQKYVpJFNDJ79n26Ex6n0cF6GDlqSWhbOQ KW+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=/KbkwsW93N9Vv2aTlYbayvbFZQfEojMb07+movbXb8I=; b=mRHW+SJeb4IS7KFnyBg7JENEHFg/ZQ14ksftrUy1Q58FsDPSZYUxVzbm4LQJgBA4g9 jAGpM07xeZiG2KWRpKnFMvepZ1mEjbGhJnfWw0KVzkUHVU2nyHZ6f8YDt2DOh7ZKdN49 epk+cPbwO0d/bz8/KEKishqOolNcS2BR0Um8xumTXvkCXAcgI7SuhyGLpRDKZOc81tRH Jcf2wG/HlqGh2qPYkArbSXK68be88auRj3vevvm8PF5AzQTfxgZuYDssQZFXydRweobw n5yzCJGWZdGTaRT79Hr2q7iRcFsi/r3kqFxL4L31yAAxetpRPkX0/A/b8BM3GVNwkYuU 9q4Q== X-Gm-Message-State: AG10YOQobKCmHBwb5GdXGGh5EnZfUvfZsYOrgcJfxOcTET6L47xGbrgowWAzCJiCGHaplhaEd8gbzCK7ycZPZw== MIME-Version: 1.0 X-Received: by 10.60.127.166 with SMTP id nh6mr17182885oeb.64.1455636181657; Tue, 16 Feb 2016 07:23:01 -0800 (PST) Sender: asomers@gmail.com Received: by 10.202.78.83 with HTTP; Tue, 16 Feb 2016 07:23:01 -0800 (PST) In-Reply-To: References: Date: Tue, 16 Feb 2016 08:23:01 -0700 X-Google-Sender-Auth: c7paMePKHQPTOPfyDQmdN2wwtB4 Message-ID: Subject: Re: camcontrol sata affiliations From: Alan Somers To: David Ford Cc: "freebsd-scsi@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 16 Feb 2016 15:23:30 -0000 On Tue, Feb 16, 2016 at 4:45 AM, David Ford wrote: > Hello, > > I have a number of dual homed SAS disk chasses, with a mixture of SAS and SATA drives. As expected, the SAS drives appear to both hosts, and the SATA drives appear on a single host, which gets the SAS affiliation. > > From the host with the SATA drive visible: > > [root@backup-san1 ~]# camcontrol smpphylist /dev/ses0 > 26 PHYs: > PHY Attached SAS Address > 0 0x0000000000000000 > 1 0x0000000000000000 > 2 0x50080e53c2b8f002 (da33,pass36) > 3 0x5000cca01ab1a139 (pass0,da0) > 4 0x0000000000000000 > 5 0x0000000000000000 > 6 0x0000000000000000 > 7 0x5000c50041affc01 (pass2,da2) > 8 0x0000000000000000 > 9 0x0000000000000000 > 10 0x5000cca03ea41585 (pass1,da1) > 11 0x0000000000000000 > 12 0x500605b004f24f20 > 13 0x500605b004f24f20 > 14 0x500605b004f24f20 > 15 0x500605b004f24f20 > 16 0x0000000000000000 > 17 0x0000000000000000 > 18 0x0000000000000000 > 19 0x0000000000000000 > 20 0x0000000000000000 > 21 0x0000000000000000 > 22 0x0000000000000000 > 23 0x0000000000000000 > 24 0x50080e53c2b8f03d > 25 0x000000000000003e > > From the other host: > > root@backup-san-02:~ # camcontrol smpphylist /dev/ses0 > 26 PHYs: > PHY Attached SAS Address > 0 0x0000000000000000 > 1 0x0000000000000000 > 2 0x0000000000000000 > 3 0x5000cca01ab1a13a (pass2,da1) > 4 0x0000000000000000 > 5 0x0000000000000000 > 6 0x0000000000000000 > 7 0x5000c50041affc02 (pass1,da0) > 8 0x0000000000000000 > 9 0x0000000000000000 > 10 0x5000cca03ea41586 (pass3,da2) > 11 0x0000000000000000 > 12 0x500605b004f27920 > 13 0x500605b004f27920 > 14 0x500605b004f27920 > 15 0x500605b004f27920 > 16 0x0000000000000000 > 17 0x0000000000000000 > 18 0x0000000000000000 > 19 0x0000000000000000 > 20 0x0000000000000000 > 21 0x0000000000000000 > 22 0x0000000000000000 > 23 0x0000000000000000 > 24 0x50080e53c1e1803d > 25 0x000000000000003e > > > I can successfully clear the affiliation: > > [root@backup-san1 ~]# camcontrol 
smppc /dev/ses0 -p 2 -o clearaffiliation > [root@backup-san1 ~]# smp_rep_phy_sata --phy=2 /dev/ses0 > Report phy SATA response: > expander change count: 74 > phy identifier: 2 > STP I_T nexus loss occurred: 0 > affiliations supported: 1 > affiliation valid: 0 > STP SAS address: 0x50080e53c2b8f002 > register device to host FIS: > 34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 > affiliated STP initiator SAS address: 0x0 > STP I_T nexus loss SAS address: 0x0 > affiliation context: 0 > current affiliation contexts: 0 > maximum affiliation contexts: 1 > > However from the other host: > > root@backup-san-02:~ # camcontrol smppc /dev/ses0 -p 2 -o sataportsel > > appears to do nothing - the output of camcontrol smpphylist /dev/ses0 and it does not appear on a rescan, or if I attempt to hard reset it. > > root@backup-san-02:~ # smp_rep_phy_sata --phy=2 /dev/ses0 > Report phy SATA result: Phy does not support SATA > > The systems are running Freebsd 10.2, and I have tested with both the mps and the mpr driver on different systems, the behaviour is identical. > > Either I'm missing a crucial step in this process, or it's a bug. Does anyone have any suggestions. > > Thanks > > David > You aren't missing anything. This is just a difference between SATA and SAS. SAS drives have two ports, and SATA drives have only one. Most (all?) multipath JBODs like yours have two separate expander chips. They connect every slot's first port to the first expander and every slot's second port to the second expander. That results in a chassis with no SPOF. With such hardware, there's no way to connect a SATA drive to both servers. And with more complicated hardware that uses a single expander chip combined with SAS zoning to connect a SATA drive to two servers, you're stuck with a SPOF. 
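(Illustrative note, not part of the original reply.) To compare two `camcontrol smpphylist` listings like the ones quoted above without reading them row by row, a small Python sketch along these lines may help. It assumes the row layout seen in those listings (PHY number, 16-hex-digit SAS address, optional device list in parentheses); the helper names `parse_phylist` and `attached_only_on_first` are hypothetical, and the sample text is abridged to PHYs 2 and 3 from the two hosts.

```python
import re

# One listing row: PHY number, SAS address, optional "(dev,dev)" suffix.
# Assumption: layout as in the smpphylist output quoted in this thread.
PHY_ROW = re.compile(
    r"^\s*(\d+)\s+(0x[0-9a-fA-F]{16})(?:\s+\(([^)]*)\))?\s*$"
)


def parse_phylist(text):
    """Return {phy_number: (sas_address, [device names])}."""
    phys = {}
    for line in text.splitlines():
        m = PHY_ROW.match(line)
        if m:  # header lines such as "26 PHYs:" do not match and are skipped
            phy, addr, devs = m.groups()
            phys[int(phy)] = (int(addr, 16), devs.split(",") if devs else [])
    return phys


def attached_only_on_first(first, second):
    """PHYs with a non-zero address on `first` but zero or absent on `second`."""
    return sorted(
        phy
        for phy, (addr, _devs) in first.items()
        if addr != 0 and second.get(phy, (0, []))[0] == 0
    )


# Abridged rows from backup-san1 and backup-san-02 respectively.
HOST_A = """\
26 PHYs:
PHY  Attached SAS Address
  2  0x50080e53c2b8f002  (da33,pass36)
  3  0x5000cca01ab1a139  (pass0,da0)
"""
HOST_B = """\
26 PHYs:
PHY  Attached SAS Address
  2  0x0000000000000000
  3  0x5000cca01ab1a13a  (pass2,da1)
"""

if __name__ == "__main__":
    only_a = attached_only_on_first(parse_phylist(HOST_A), parse_phylist(HOST_B))
    print("PHYs attached only on the first host:", only_a)
```

With the abridged sample, only PHY 2 (the SATA drive) is reported; PHY 3 shows up on both hosts, with a different port address on each.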
-Alan

From owner-freebsd-scsi@freebsd.org Tue Feb 16 15:32:41 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 37BE1AAA80D for ; Tue, 16 Feb 2016 15:32:41 +0000 (UTC) (envelope-from david.ford@ouce.ox.ac.uk) Received: from relay13.mail.ox.ac.uk (relay13.mail.ox.ac.uk [129.67.1.166]) by mx1.freebsd.org (Postfix) with ESMTP id 07B821FEC; Tue, 16 Feb 2016 15:32:40 +0000 (UTC) (envelope-from david.ford@ouce.ox.ac.uk) Received: from hub05.nexus.ox.ac.uk ([163.1.154.231] helo=HUB05.ad.oak.ox.ac.uk) by relay13.mail.ox.ac.uk with esmtp (Exim 4.80) (envelope-from ) id 1aVhc7-0005oy-gR; Tue, 16 Feb 2016 15:32:31 +0000 Received: from MBX01.ad.oak.ox.ac.uk ([169.254.1.95]) by HUB05.ad.oak.ox.ac.uk ([163.1.154.96]) with mapi id 14.03.0248.002; Tue, 16 Feb 2016 15:32:30 +0000 From: David Ford To: 'Alan Somers' CC: "freebsd-scsi@freebsd.org" Subject: RE: camcontrol sata affiliations Thread-Topic: camcontrol sata affiliations Thread-Index: AdForTYiiodPtgJoTRCpk4BC+u40TgAILYGAAAAXiDA= Date: Tue, 16 Feb 2016 15:32:30 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.150.237] Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Feb 2016 15:32:41 -0000

> -----Original Message-----
> From: asomers@gmail.com [mailto:asomers@gmail.com] On Behalf Of Alan Somers
> Sent: 16 February 2016 15:23
> To: David Ford <david.ford@ouce.ox.ac.uk>
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: camcontrol sata affiliations
>
> You aren't missing anything.  This is just a difference between SATA
> and SAS.  SAS drives have two ports, and SATA drives have only one.
> Most (all?) multipath JBODs like yours have two separate expander
> chips.  They connect every slot's first port to the first expander and
> every slot's second port to the second expander.  That results in a
> chassis with no SPOF.  With such hardware, there's no way to connect a
> SATA drive to both servers.  And with more complicated hardware that
> uses a single expander chip combined with SAS zoning to connect a SATA
> drive to two servers, you're stuck with a SPOF.

Thanks that at least clarifies what's going on. I had understood most of that,
however I thought that the idea of the ability to clear the affiliation
was to permit the second expander to take over. I suspect I was mistaken.

Thanks

David

From owner-freebsd-scsi@freebsd.org Tue Feb 16 15:55:43 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 287E9AA91FB for ; Tue, 16 Feb 2016 15:55:43 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-ob0-x22b.google.com (mail-ob0-x22b.google.com [IPv6:2607:f8b0:4003:c01::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DF3CE1C68 for ; Tue, 16 Feb 2016 15:55:42 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by
mail-ob0-x22b.google.com with SMTP id gc3so163254183obb.3 for ; Tue, 16 Feb 2016 07:55:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=uMJLk5Ncux4AGarFShHIEtX3DpOLhbW+7kabYd/uhMI=; b=ysUgpycxlp1U1BG3pGAoaqdBnmnJ1mNbXI/W9J8U+ptvrJy5M3hF5ssZFOhrwXnxxv hZ2NLEdumOjq2eyMEUeRS9FUgJZa1ofh1ixlMMudhGINRNyf2Xb4sEbWEvbYK27Ur6qD W86ojseC/fEqzCsUSoTKmjE7PRK00RQirNC42bVok27lHf8PFWI3YzKE5jYtU2eGic9f vUT2e7i4s48kMX39Zch+m3FL+BXqkXYt4JKJrAd/mtaP/vZkPN81fO5+EKH680RSvraR C0QybLOdVQ8Nt0Lu3SdYO22i/3UgwdOBEreNaEeG3mbskvcae2Dp25d5FBiVYafNaYIJ SjWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=uMJLk5Ncux4AGarFShHIEtX3DpOLhbW+7kabYd/uhMI=; b=Mt9YsDRHj/TIG5W3VtMx5vtIMqD1gZla2n/tvm98krhAAIOlF426Lo/FhRwSrDuAnK atllWGPqGt6latdmIAh70UjsZI1uWkeinn3D7t6UTHH4fsam25cg5tHDmCk1YJKNT1bb LRNTvUGsib/uqqJFU3ZzfF1kh6+8/qDQKM42+MHI5v0nc4UmoPSlyJci5/4cshxfcDqd QidNBCwVi0d+1hqWeyCYtRtCyFd/Lcad8yY3IIXB311HoFMvckp8swShvoP7MKDAfQge VSDkhEYkxIKazoZ9dIUzxhXCXkzLMVPsJYfDlTZs0NK4jg/Y675hu+mkn20sXNbh5wjY s8jw== X-Gm-Message-State: AG10YOTonZsnbvtJ5bGEgB8xtZmuPrg5vjnZsfLcxlqxjQeRPrvB7f/RqZToovONUFgix/BruVdvGRz7z3Dp1A== MIME-Version: 1.0 X-Received: by 10.60.246.74 with SMTP id xu10mr17349068oec.31.1455638135303; Tue, 16 Feb 2016 07:55:35 -0800 (PST) Sender: asomers@gmail.com Received: by 10.202.78.83 with HTTP; Tue, 16 Feb 2016 07:55:35 -0800 (PST) In-Reply-To: References: Date: Tue, 16 Feb 2016 08:55:35 -0700 X-Google-Sender-Auth: jk2mj6cN8vsT3_mehyyS69roVUU Message-ID: Subject: Re: camcontrol sata affiliations From: Alan Somers To: David Ford Cc: "freebsd-scsi@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Feb 2016 15:55:43 -0000

On Tue, Feb 16, 2016 at 8:32 AM, David Ford wrote:
>> -----Original Message-----
>> From: asomers@gmail.com [mailto:asomers@gmail.com] On Behalf Of Alan Somers
>> Sent: 16 February 2016 15:23
>> To: David Ford
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: camcontrol sata affiliations
>>
>> You aren't missing anything. This is just a difference between SATA
>> and SAS. SAS drives have two ports, and SATA drives have only one.
>> Most (all?) multipath JBODs like yours have two separate expander
>> chips. They connect every slot's first port to the first expander and
>> every slot's second port to the second expander. That results in a
>> chassis with no SPOF. With such hardware, there's no way to connect a
>> SATA drive to both servers. And with more complicated hardware that
>> uses a single expander chip combined with SAS zoning to connect a SATA
>> drive to two servers, you're stuck with a SPOF.
>
> Thanks that at least clarifies what's going on. I had understood most of that,
> however I thought that the idea of the ability to clear the affiliation
> was to permit the second expander to take over. I suspect I was mistaken.
>
> Thanks
>
> David
>

Only if you have a JBOD that uses a single expander chip for both hosts.
Then I think those commands would do what you want.

From owner-freebsd-scsi@freebsd.org Wed Feb 17 00:01:11 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 058E5AAB088 for ; Wed, 17 Feb 2016 00:01:11 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id E4A622DA for ; Wed, 17 Feb 2016 00:01:10 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 16 Feb 2016 16:14:48 -0800 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.9/8.14.4) with ESMTP id u1H002VC085902; Tue, 16 Feb 2016 16:00:02 -0800 (PST) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.9/8.14.4/Submit) id u1H002Bs085890; Tue, 16 Feb 2016 16:00:02 -0800 (PST) (envelope-from ambrisko) Date: Tue, 16 Feb 2016 16:00:02 -0800 From: Doug Ambrisko To: Tinker Cc: freebsd-scsi@freebsd.org Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs? 
Message-ID: <20160217000002.GA81916@ambrisko.com> References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> <55de137d1ed81930cfdbee579d881d62@openmailbox.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55de137d1ed81930cfdbee579d881d62@openmailbox.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Feb 2016 00:01:11 -0000

On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
| (Will send any followup from now only to freebsd-scsi@ .)
|
| Did some additional research and found that the disk failure indeed is
| reported in MRSAS' "event log".
|
| So my final question then is, how do you extract it into userland (in
| the absence of an "mfiutil" as the MFI driver has)?

I have local changes to print the event log in dmesg which gets syslogged.
We then watch syslog for issues to report things to our customers
automatically. This is similar to mfi(4).

Thanks,

Doug A.

| Details below. Thanks.
|
| On 2016-02-14 19:59, Tinker wrote:
| [...]
| > http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
| > on page 305, that is section "A.2 Event Messages" - I don't know for
| > what LSI chip this document is, but, it does not list particular event
| > message very clearly for when an individual underlying disk would have
| > broken, I don't even see any event for when a hot spare would be taken
| > in use!
|
|
| Wait - this page:
|
| https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array
|
| (and also
| http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it
| )
|
| gives an example of how the host system learns about broken disks:
|
|
| Code: 0x00000051 .. Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)
|
|
| Code: 0x00000072 .. Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)
|
| (unclean disk broken seems to be shown as:)
|
| Code: 0x00000071 .. Event Description: Unexpected sense: PD 05(e0xfc/s0)
| Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00
|
|
| And this version of the LSI documentation
|
| http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf
|
| gives a clearer definition of the physical and virtual drive states in
| "1.4.16 Physical Drive States"
| and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.
|
| So as we see, a physical drive breaking would
|
| * "FAILED" the physical drive
|
| * "DEGRADED" the Virtual Drive (that is the logical exported drive)
| (from "OPTIMAL")
|
|
| So then, it was indeed the card's "event log" that contains this info.
|
|
|
| Last question then would only be then, *where* FreeBSD's MRSAS driver
| sends its event log?
|
|
|
| _______________________________________________
| freebsd-stable@freebsd.org mailing list
| https://lists.freebsd.org/mailman/listinfo/freebsd-stable
| To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

From owner-freebsd-scsi@freebsd.org Wed Feb 17 08:15:06 2016
Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C604AAA976 for ; Wed, 17 Feb 2016 08:15:06 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: from smtp6.openmailbox.org (smtp6.openmailbox.org [62.4.1.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CFB11E73 for ; Wed, 17 Feb 2016 08:15:05 +0000 (UTC) (envelope-from tinkr@openmailbox.org) Received: by mail2.openmailbox.org (Postfix, from userid 1004) id BCCAF2AC46FF; Wed, 17 Feb 2016 08:38:22 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=openmailbox.org; s=openmailbox; t=1455694702; bh=sWHKZE5BOpoRV0FUlKivni4mKu7z1BW9pUiD5JGg5iw=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=Sezlo60fju+xp/mo2jPu+2WWGl/lpsZCnuKXh6g9eQBsMjyijX2SUZn1Qxg02bX/9 qO64SV5mgoI3NjkrDnS7ag9kftjxpGaJXOFga0oTv/LoDKjGkEhVcOZRGD3LHwoZiE BHpMkQrHV4t6L4pjremfMEV3zbFflkJQkToN0IRQ= X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on openmailbox-b2 X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=ALL_TRUSTED,BAYES_50, DKIM_ADSP_ALL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from www.openmailbox.org (openmailbox-b1 [10.91.69.218]) by mail2.openmailbox.org (Postfix) with ESMTP id 8A70F2AC4B23; Wed, 17 Feb 2016 08:38:10 +0100 (CET) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Wed, 17 Feb 2016 14:38:10 +0700 From: Tinker To: Doug Ambrisko Cc:
freebsd-scsi@freebsd.org Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs? In-Reply-To: <20160217000002.GA81916@ambrisko.com> References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> <55de137d1ed81930cfdbee579d881d62@openmailbox.org> <20160217000002.GA81916@ambrisko.com> Message-ID: X-Sender: tinkr@openmailbox.org User-Agent: Roundcube Webmail/1.0.6 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Feb 2016 08:15:06 -0000

Hi Doug,

Would you mind sharing your kernel patch for that functionality (if I
understand you right, you patched your kernel to channelize the events
to the dmesg)?

Thanks,
Tinker

On 2016-02-17 07:00, Doug Ambrisko wrote:
> On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
> | (Will send any followup from now only to freebsd-scsi@ .)
> |
> | Did some additional research and found that the disk failure indeed is
> | reported in MRSAS' "event log".
> |
> | So my final question then is, how do you extract it into userland (in
> | the absence of an "mfiutil" as the MFI driver has)?
>
> I have local changes to print the event log in dmesg which gets syslogged.
> We then watch syslog for issues to report things to our customers
> automatically. This is similar to mfi(4).
>
> Thanks,
>
> Doug A.
> | Details below. Thanks.
> |
> | On 2016-02-14 19:59, Tinker wrote:
> | [...]
> |
> |
> |
> | _______________________________________________
> | freebsd-stable@freebsd.org mailing list
> | https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> | To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

From owner-freebsd-scsi@freebsd.org Thu Feb 18 17:33:31 2016
Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7B5B5AAD5DA for ; Thu, 18 Feb 2016 17:33:31 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 56EA2AEC for ; Thu, 18 Feb 2016 17:33:31 +0000 (UTC) (envelope-from ambrisko@ambrisko.com) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 18 Feb 2016 09:48:10 -0800 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.9/8.14.4) with ESMTP id u1IHXPsf029514; Thu, 18 Feb 2016 09:33:25 -0800 (PST) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.9/8.14.4/Submit) id u1IHXPlB029513; Thu, 18 Feb 2016 09:33:25 -0800 (PST) (envelope-from ambrisko) Date: Thu, 18 Feb 2016 09:33:25 -0800 From: Doug Ambrisko To: Tinker Cc: freebsd-scsi@freebsd.org Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs?
Message-ID: <20160218173325.GA29200@ambrisko.com> References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org> <55de137d1ed81930cfdbee579d881d62@openmailbox.org> <20160217000002.GA81916@ambrisko.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Feb 2016 17:33:31 -0000

On Wed, Feb 17, 2016 at 02:38:10PM +0700, Tinker wrote:
| Hi Doug,
|
| Would you mind sharing your kernel patch for that functionality (if I
| understand you right, you patched your kernel to channelize the events
| to the dmesg)?

I need to do some work on mrsas stuff at work, so I plan to sync our
changes to -current etc. I'll send them to you.

Doug A.

| On 2016-02-17 07:00, Doug Ambrisko wrote:
| > On Sun, Feb 14, 2016 at 10:13:31PM +0700, Tinker wrote:
| > | (Will send any followup from now only to freebsd-scsi@ .)
| > |
| > | Did some additional research and found that the disk failure indeed is
| > | reported in MRSAS' "event log".
| > |
| > | So my final question then is, how do you extract it into userland (in
| > | the absence of an "mfiutil" as the MFI driver has)?
| >
| > I have local changes to print the event log in dmesg which gets syslogged.
| > We then watch syslog for issues to report things to our customers
| > automatically. This is similar to mfi(4).
| >
| > Thanks,
| >
| > Doug A.
| > | Details below. Thanks.
| > |
| > | On 2016-02-14 19:59, Tinker wrote:
| > | [...]
| > |
| > |
| > |
| > | Last question then would only be then, *where* FreeBSD's MRSAS driver
| > | sends its event log?
| > |
| > |
| > |
| > | _______________________________________________
| > | freebsd-stable@freebsd.org mailing list
| > | https://lists.freebsd.org/mailman/listinfo/freebsd-stable
| > | To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
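
For anyone reading this thread later: once the controller events reach syslog, watching for the state changes discussed in this thread can be sketched roughly as below. This is only a sketch in Python, not Doug's patch or anything shipped with mrsas(4). It assumes the logged lines keep the MegaCLI-style wording quoted in the thread ("State change on PD ... from ONLINE(18) to FAILED(11)"); what a patched driver really prints may differ. The names parse_state_change, scan and ALERT_STATES are made up for illustration.

```python
import re

# Assumed line shape, taken from the MegaCLI excerpts quoted in this thread.
# The text a patched mrsas(4) writes to dmesg/syslog may look different.
STATE_CHANGE = re.compile(
    r"State change on (?P<kind>PD|VD)\s+(?P<ident>\S+)\s+"
    r"from\s+(?P<old>[A-Z_]+)\(\d+\)\s+"
    r"to\s+(?P<new>[A-Z_]+)\(\d+\)"
)

# New states worth reporting: the two named in the thread.
ALERT_STATES = {"FAILED", "DEGRADED"}


def parse_state_change(line):
    """Return a dict with kind, ident, old and new, or None if no match."""
    match = STATE_CHANGE.search(line)
    if match is None:
        return None
    return match.groupdict()


def scan(lines):
    """Yield one short alert string per line that enters an alert state."""
    for line in lines:
        event = parse_state_change(line)
        if event is not None and event["new"] in ALERT_STATES:
            yield "{kind} {ident}: {old} -> {new}".format(**event)
```

Feeding scan() the lines of the system log (or an open file object for it) would yield one message per drive or volume that goes FAILED or DEGRADED; "Unexpected sense" lines are ignored because they are not state changes.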