From owner-freebsd-scsi@FreeBSD.ORG Sat Jun 25 01:43:13 2011
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5A0A3106564A;
	Sat, 25 Jun 2011 01:43:13 +0000 (UTC)
	(envelope-from joachim@tingvold.com)
Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 15DD38FC1D;
	Sat, 25 Jun 2011 01:43:12 +0000 (UTC)
Received: from aannecy-552-1-282-248.w83-201.abo.wanadoo.fr ([83.201.250.248] helo=keklolwtf.home)
	by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.72) (envelope-from ) id 1QaHhp-0000aj-Ev;
	Sat, 25 Jun 2011 03:30:42 +0200
Mime-Version: 1.0 (Apple Message framework v1076)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
From: Joachim Tingvold
In-Reply-To: <20110225183351.GA31590@nargothrond.kdm.org>
Date: Sat, 25 Jun 2011 03:30:37 +0200
Content-Transfer-Encoding: 7bit
Message-Id: <8AFCE2C0-A87D-414B-912F-C80C158B6D94@tingvold.com>
References: <20110204180011.GA38067@nargothrond.kdm.org>
	<20110208201310.GA97635@nargothrond.kdm.org>
	<4A14FA28-6C9E-4F22-B7A3-4295ACD77719@tingvold.com>
	<20110218171619.GB78796@nargothrond.kdm.org>
	<318745DD-B5F4-4693-B3F2-22DF8D437349@tingvold.com>
	<20110221155041.GA37922@nargothrond.kdm.org>
	<3037190B-6CF2-4C8E-8350-5BA4F13456A8@tingvold.com>
	<20110221214544.GA43886@nargothrond.kdm.org>
	<2E532F21-B969-4216-9765-BC1CC1EAB522@tingvold.com>
	<20110225183351.GA31590@nargothrond.kdm.org>
To: Kenneth D. Merry
X-Mailer: Apple Mail (2.1076)
Cc: freebsd-scsi@freebsd.org, Alexander Motin
Subject: Re: mps0-troubles
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Sat, 25 Jun 2011 01:43:13 -0000

On Fri, Feb 25, 2011, at 19:33:51PM GMT+01:00, Kenneth D.
Merry wrote:
> I just checked the change into -current, I'll merge it to -stable
> next week.

I'm back! Missed me? :-D

After running fine for a while, I decided to do some more testing.
Usual 'dd' in a while-loop over the night, and woke up to this;

###
mps0: (0:39:0) terminated ioc 804b scsi 0 state c xfer 65536
mps0: (0:39:0) terminated ioc 804b scsi 0 state c xfer 65536
mps0: (0:39:0) terminated ioc 804b scsi 0 state c xfer 65536
mps0: (0:39:0) terminated ioc 804b scsi 0 state c xfer 65536
mps0: (0:39:0) terminated ioc 804b scsi 0 state c xfer 0
mps0: (0:39:0) terminated ioc 804b scsi 0 state 0 xfer 0
mps0: (0:39:0) terminated ioc 804b scsi 0 state 0 xfer 0
mps0: (0:39:0) terminated ioc 804b scsi 0 state 0 xfer 0
mps0: (0:39:0) terminated ioc 804b scsi 0 state 0 xfer 0
mps0: mpssas_remove_complete on target 0x0027, IOCStatus= 0x0
(da7:mps0:0:39:0): lost device
(da7:mps0:0:39:0): Invalidating pack
(da7:mps0:0:39:0): Invalidating pack
(da7:mps0:0:39:0): Invalidating pack
(da7:mps0:0:39:0): Invalidating pack
(da7:mps0:0:39:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
(da7:mps0:0:39:0): removing device entry
da7 at mps0 bus 0 scbus0 target 39 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 300.000MB/s transfers
da7: Command Queueing enabled
da7: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
###

Now, the disk was present at the time I checked, as camcontrol confirms;

[root@filserver /storage/tmp]# camcontrol devlist|grep da7
 at scbus0 target 39 lun 0 (pass8,da7)

However, the disk was marked as "REMOVED" by 'zpool status';

###
[jocke@filserver /storage/tmp]$ zpool status
  pool: storage
 state: DEGRADED

    NAME          STATE     READ WRITE CKSUM
    storage       DEGRADED     0     0     0
      raidz2-0    ONLINE       0     0     0
        da8       ONLINE       0     0     0
        da9       ONLINE       0     0     0
        da10      ONLINE       0     0     0
        da11      ONLINE       0     0     0
        da15      ONLINE       0     0     0
        da16      ONLINE       0     0     0
      raidz2-1    DEGRADED     0     0     0
        da0       ONLINE       0     0     0
        da1       ONLINE       0     0     0
        da2       ONLINE       0     0     0
        da3       ONLINE       0     0     0
        da4       ONLINE       0     0     0
        da5       ONLINE       0     0     0
        da6       ONLINE       0     0     0
        da7       REMOVED      0     0     0
        da12      ONLINE       0     0     0
        da13      ONLINE       0     0     0
    spares
      da14        AVAIL
###

A quick 'zpool online storage da7' works fine, as suspected, and the
pool is resilvering at the moment.

I find it a bit worrisome that a disk was removed like that. It
_could_ be that the disk isn't completely good; however, given my
previous experiences with mps, I suspect the disk is fine (the
smartctl readouts on the disk seem to be good as well).

-- 
Joachim