Date: Sun, 14 Feb 2016 22:13:31 +0700
From: Tinker <tinkr@openmailbox.org>
To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org, freebsd-fs@freebsd.org
Subject: Re: MRSAS driver/LSI MegaRaid 92XX-93XX admin question: When one of the Raid's physical drives break, how is it reported in the logs?
In-Reply-To:
 <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
References: <6a648d421b6d611b4f6f411b66303017@openmailbox.org>
Message-ID: <55de137d1ed81930cfdbee579d881d62@openmailbox.org>
List-Id: Production branch of FreeBSD source code

(Will send any followup from now on only to freebsd-scsi@ .)

Did some additional research and found that the disk failure is indeed reported in MRSAS' "event log".

So my final question is: how do you extract it into userland (in the absence of an "mfiutil" like the one the MFI driver has)?

Details below. Thanks.

On 2016-02-14 19:59, Tinker wrote:
[...]
> http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
> on page 305, that is section "A.2 Event Messages" - I don't know for
> what LGI chip this document is, but, it does not list particular event
> message very clearly for when an individual underlying disk would have
> broken, I don't even see any event for when a hot spare would be taken
> in use!

Wait - this page:

https://www.schirmacher.de/display/Linux/Replace+failed+disk+in+MegaRAID+array

(and also http://serverfault.com/questions/485147/drive-is-failing-but-lsi-megaraid-controller-does-not-detect-it )

gives an example of how the host system learns about broken disks:

Code: 0x00000051 ..
Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)

Code: 0x00000072 ..
Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)

(A disk that breaks uncleanly seems to be shown as:)

Code: 0x00000071 ..
Event Description: Unexpected sense: PD 05(e0xfc/s0) Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00

And this version of the LSI documentation

http://hwraid.le-vert.net/raw-attachment/wiki/LSIMegaRAIDSAS/megacli_user_guide.pdf

gives a clearer definition of the physical and virtual drive states in "1.4.16 Physical Drive States" and "1.4.17 Virtual Disk States" on pages 1-11 to 1-12.

So as we see, a physical drive breaking would

 * set the physical drive to "FAILED"
 * set the Virtual Drive (that is, the logical exported drive) to "DEGRADED" (from "OPTIMAL")

So it is indeed the card's "event log" that contains this info.

The last question, then, is only: *where* does FreeBSD's MRSAS driver send its event log?
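
For what it is worth, here is a minimal sketch of how the "State change" lines quoted above could be scanned once the event log text is available in userland. It assumes the log arrives as plain text in exactly the format of the examples above. The regular expression and the names STATE_CHANGE, ALERT_STATES, SAMPLE and find_alerts are hypothetical, made up for illustration. It does not answer how the MRSAS driver exposes the log.

```python
import re

# Assumed format, taken from the examples quoted in this thread:
#   State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)
#   State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)
STATE_CHANGE = re.compile(
    r"State change on (?P<kind>PD|VD) (?P<dev>\S+) "
    r"from (?P<old>[A-Z_ ]+?)\((?P<old_code>\d+)\) "
    r"to (?P<new>[A-Z_ ]+?)\((?P<new_code>\d+)\)"
)

# Target states worth alerting on; only the two discussed in this thread.
ALERT_STATES = {"FAILED", "DEGRADED"}

# Sample input copied from the event examples quoted above.
SAMPLE = """\
Code: 0x00000051
Event Description: State change on VD 00/1 from OPTIMAL(3) to DEGRADED(2)
Code: 0x00000072
Event Description: State change on PD 05(e0xfc/s0) from ONLINE(18) to FAILED(11)
Code: 0x00000071
Event Description: Unexpected sense: PD 05(e0xfc/s0) Path 4433221103000000, CDB: 2e 00 3a 38 1b c7 00 00 01 00, Sense: b/00/00
"""


def find_alerts(log_text):
    """Return (kind, device, old_state, new_state) for each state change
    whose new state is listed in ALERT_STATES."""
    alerts = []
    for m in STATE_CHANGE.finditer(log_text):
        if m.group("new") in ALERT_STATES:
            alerts.append(
                (m.group("kind"), m.group("dev"), m.group("old"), m.group("new"))
            )
    return alerts


if __name__ == "__main__":
    for kind, dev, old, new in find_alerts(SAMPLE):
        print(f"{kind} {dev}: {old} -> {new}")
```

The "Unexpected sense" event is deliberately not matched here, since it is not a state change; it would need its own pattern if it should raise an alert too.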