From owner-freebsd-scsi@FreeBSD.ORG Wed Feb 23 03:58:18 2011
From: Joachim Tingvold <joachim@tingvold.com>
Date: Wed, 23 Feb 2011 04:58:14 +0100
To: Kenneth D. Merry
Cc: freebsd-scsi@freebsd.org, Alexander Motin
Subject: Re: mps0-troubles
In-Reply-To: <20110221214544.GA43886@nargothrond.kdm.org>
Message-Id: <2E532F21-B969-4216-9765-BC1CC1EAB522@tingvold.com>
List-Id: SCSI subsystem <freebsd-scsi@freebsd.org>

On Mon, Feb 21, 2011, at 22:45:44 GMT+01:00, Kenneth D. Merry wrote:
>>> Okay, good.  It looks like it is running as designed.
>> It is?
>> It's still terminating the commands, which I guess it shouldn't?
>>
>> mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0
> Sorry, I missed that, I was just looking at the first part.

No worries. (-:

> I'm still waiting for LSI to look at the SAS analyzer trace I sent them
> for the "IOC terminated" bug.
>
> It appears to be (at least on my hardware) a backend issue of some sort,
> and probably not anything we can fix in the driver.

I see. Good to know that you're able to reproduce it, since that most
likely rules out a hardware issue on my controller.

> Since you've got an HP branded expander, that makes it a little more
> difficult to determine whether it's an LSI, Maxim, or some other
> expander. Can you try the following on your system? You'll need the
> sg3_utils port:
>
> sg_inq -i ses0
>
> (I need to update camcontrol to parse page 0x83 output.)
>
> [...]
>
> Maxim expanders seem to report LUN descriptors in VPD page 0x83 instead
> of target port descriptors. We might get a slight clue from the output,
> but it's hard to say for certain since HP could have customized the
> page 0x83 values in the expander firmware.

VPD INQUIRY: Device Identification page
  Designation descriptor number 1, descriptor length: 12
    transport: Serial Attached SCSI (SAS)
    designator_type: NAA,  code_set: Binary
    associated with the target port
      NAA 5, IEEE Company_id: 0x1438
      Vendor Specific Identifier: 0x101a2865
      [0x50014380101a2865]
  Designation descriptor number 2, descriptor length: 8
    transport: Serial Attached SCSI (SAS)
    designator_type: Relative target port,  code_set: Binary
    associated with the target port
      Relative target port: 0x1

>> It just doesn't display the 'out of chain'-errors, that's all I think.
>
> Well, if you don't see the 'out of chain' errors with 2048 chain
> buffers, that means the condition isn't happening.
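(In case anyone else wants to experiment before the new default lands:
if I've understood the driver correctly, the chain frame count can be
set from a loader tunable. The tunable name below is my assumption --
verify it against the mps(4) driver source before relying on it.)

```
# /boot/loader.conf -- tunable name assumed, check sys/dev/mps before use
hw.mps.max_chains="2048"
```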
> The cost of going from 1024 to 2048 is only 32K of extra memory, which
> is not a big deal, so I think I'll go ahead and bump the limit up and
> remove the printfs. We've now proven the recovery strategy, so it'll
> just slow things down slightly if anyone runs into that issue again.

Good. The impact is so small that it shouldn't trouble anyone.

>>> What filesystem are you using by the way?
>> ZFS.
> Interesting. I haven't been able to run out of chain elements with
> ZFS, but I can use quite a few with UFS. I had to artificially limit
> the number of chain elements to test the change.

Maybe it's because of the number of disks I have in the same pool? Or
because I have two unevenly-sized raidz2 vdevs in the same pool? The
latter has to be forced when adding it to the pool, so I guess it's not
an "ideal" solution... (but "everyone" seems to do it).

--
Joachim