Date: Sat, 17 Oct 2015 17:29:48 +0200
From: Bernd Walter
To: freebsd-stable@freebsd.org
Cc: Bernd Walter
Subject: mfi0: I/O error on MegaRAID SAS 9361-4i
Message-ID: <20151017152947.GD56791@cicely7.cicely.de>
Reply-To: ticso@cicely.de

The system is running a ZFS data pool on a 12-disk JBOD using a MegaRAID SAS
9361-4i controller.
After some time it starts printing the following errors:

Oct 16 22:26:46 hostname kernel: mfi0: I/O error, cmd=0xfffffe0001079c90, status=0x3c, scsi_status=0
Oct 16 22:26:46 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:46 hostname kernel: mfisyspd6: hard error cmd=write 410262844-410263362
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe0001077188, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd5: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010780f0, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd6: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010778f8, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd7: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel:
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe000107a048, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd7: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe0001078db0, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd6: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010784a8, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd5: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfi0:
Oct 16 22:26:49 hostname kernel: I/O error, cmd=0xfffffe0001078750, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd10: hard error cmd=write 362223336-362223832
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe0001076660, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd11: hard error cmd=write 362223336-362223831
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010774b8, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd10: hard error cmd=write 362223336-362223832
Oct 16 22:26:49 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010788e8, status=0x3c, scsi_status=0
Oct 16 22:26:49 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:49 hostname kernel: mfisyspd11: hard error cmd=write 362223336-362223831
Oct 16 22:26:59 hostname kernel: mfi0: I/O error, cmd=0xfffffe0001076f68, status=0x3c, scsi_status=0
Oct 16 22:26:59 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:59 hostname kernel: mfisyspd3: hard error cmd=write 295491976-295492474
Oct 16 22:26:59 hostname kernel: mfi0: I/O error, cmd=0xfffffe00010797c8, status=0x3c, scsi_status=0
Oct 16 22:26:59 hostname kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
Oct 16 22:26:59 hostname kernel: mfisyspd0: hard error cmd=write 295491977-295492475
[...] continues endlessly [...]

The interrupt load in MFI is high and gstat shows disk load, but ZFS makes
no progress anymore.
I can only log into the machine because / is running on UFS.

After switching to mrsas things are different.
I got the following messages:

ses0: da0,pass1: Element descriptor: 'Slot00'
ses0: da0,pass1: SAS Device Slot Element: 1 Phys at Slot 0
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe080
ses0: da7,pass8: Element descriptor: 'Slot01'
ses0: da7,pass8: SAS Device Slot Element: 1 Phys at Slot 1
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe081
ses0: da1,pass2: Element descriptor: 'Slot02'
ses0: da1,pass2: SAS Device Slot Element: 1 Phys at Slot 2
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe082
ses0: da6,pass7: Element descriptor: 'Slot03'
ses0: da6,pass7: SAS Device Slot Element: 1 Phys at Slot 3
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe083
ses0: da3,pass4: Element descriptor: 'Slot04'
ses0: da3,pass4: SAS Device Slot Element: 1 Phys at Slot 4
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe084
ses0: da9,pass10: Element descriptor: 'Slot05'
ses0: da9,pass10: SAS Device Slot Element: 1 Phys at Slot 5
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe085
ses0: da2,pass3: Element descriptor: 'Slot06'
ses0: da2,pass3: SAS Device Slot Element: 1 Phys at Slot 6
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe086
ses0: da8,pass9: Element descriptor: 'Slot07'
ses0: da8,pass9: SAS Device Slot Element: 1 Phys at Slot 7
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe087
ses0: da5,pass6: Element descriptor: 'Slot08'
ses0: da5,pass6: SAS Device Slot Element: 1 Phys at Slot 8
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe088
ses0: da10,pass11: Element descriptor: 'Slot09'
ses0: da10,pass11: SAS Device Slot Element: 1 Phys at Slot 9
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe089
ses0: da4,pass5: Element descriptor: 'Slot10'
ses0: da4,pass5: SAS Device Slot Element: 1 Phys at Slot 10
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe08a
ses0: da11,pass12: Element descriptor: 'Slot11'
ses0: da11,pass12: SAS Device Slot Element: 1 Phys at Slot 11
ses0:  phy 0: SATA device
ses0:  phy 0: parent 5003048001abe0bf addr 5003048001abe08b

I have no idea what they mean or whether they point to a real problem, but
everything just continues normally.
I don't even know whether these are the events that got the MFI driver into
permanent trouble.
ZFS is still happy with all the drives after those log messages.

-- 
B.Walter http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.
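
For triage, the two kinds of kernel messages quoted in this report can be
summarized with a short script. This is only a minimal sketch and not part of
the original report: the function names count_hard_errors and slot_map and the
regular expressions are assumptions derived solely from the line formats shown
in this message, and the SAMPLE text is a small excerpt of those lines. The
ses0 lines appear to describe enclosure slots and the da device found in each,
so the second function simply pairs the two.

```python
import re
from collections import Counter

# Assumed format, taken from the quoted log:
#   "mfisyspd6: hard error cmd=write 410262844-410263362"
HARD_ERROR = re.compile(r"(mfisyspd\d+): hard error cmd=\w+ \d+-\d+")

# Assumed format, taken from the quoted log:
#   "ses0: da7,pass8: Element descriptor: 'Slot01'"
SLOT_DESC = re.compile(r"ses\d+: (da\d+),pass\d+: Element descriptor: '([^']+)'")


def count_hard_errors(log_text):
    """Count 'hard error' lines per mfisyspd device."""
    return Counter(m.group(1) for m in HARD_ERROR.finditer(log_text))


def slot_map(log_text):
    """Map each enclosure element descriptor to the da device reported in it."""
    return {m.group(2): m.group(1) for m in SLOT_DESC.finditer(log_text)}


# Excerpt of lines copied from the report above.
SAMPLE = """\
Oct 16 22:26:46 hostname kernel: mfisyspd6: hard error cmd=write 410262844-410263362
Oct 16 22:26:49 hostname kernel: mfisyspd5: hard error cmd=write 410267090-410267593
Oct 16 22:26:49 hostname kernel: mfisyspd6: hard error cmd=write 410267090-410267593
ses0: da0,pass1: Element descriptor: 'Slot00'
ses0: da7,pass8: Element descriptor: 'Slot01'
"""

# Print a per-device error tally, then the slot-to-device pairs.
for dev, n in sorted(count_hard_errors(SAMPLE).items()):
    print(f"{dev}: {n} hard error(s)")
for slot, dev in sorted(slot_map(SAMPLE).items()):
    print(f"{slot} -> {dev}")
```

Feeding the full /var/log/messages text to the same two functions would show
whether the write errors cluster on a few disks or are spread across the whole
JBOD. A spread across many disks would point away from a single bad drive,
though that reading is a guess and not something the logs state.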