From owner-freebsd-scsi@FreeBSD.ORG  Tue Feb  8 20:13:12 2011
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Date: Tue, 8 Feb 2011 13:13:10 -0700
From: "Kenneth D. Merry" <ken@freebsd.org>
To: Joachim Tingvold <joachim@tingvold.com>
Message-ID: <20110208201310.GA97635@nargothrond.kdm.org>
References: <20110114001758.GA12793@nargothrond.kdm.org>
	<D24332F3-56AF-484C-9592-1097BF684E37@tingvold.com>
	<07392102-4584-4690-9188-5202728CC7CA@tingvold.com>
	<20110120155746.GA22515@nargothrond.kdm.org>
	<BC40CE83-6116-49CD-8D37-5BC29893449D@tingvold.com>
	<070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com>
	<20110203221056.GA25389@nargothrond.kdm.org>
	<FFF5EF18-055C-4E0C-8F9B-03564217F80F@tingvold.com>
	<20110204180011.GA38067@nargothrond.kdm.org>
	<DE11FC96-06DB-479F-8673-B9ACE2805390@tingvold.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <DE11FC96-06DB-479F-8673-B9ACE2805390@tingvold.com>
User-Agent: Mutt/1.4.2i
Cc: freebsd-scsi@freebsd.org, Alexander Motin <mav@freebsd.org>
Subject: Re: mps0-troubles
X-List-Received-Date: Tue, 08 Feb 2011 20:13:12 -0000

On Tue, Feb 08, 2011 at 02:35:35 +0100, Joachim Tingvold wrote:
> On Fri, Feb 04, 2011, at 19:00:11PM GMT+01:00, Kenneth D. Merry wrote:
> >Perhaps it could depend on memory fragmentation somewhat.  Over time you
> >may see the low water mark go down a bit.
> >
> >The good news is that it doesn't look like we have a leak.
> 
> <http://home.komsys.org/~jocke/dmesg_mps0_freebsd-scsi_5.txt>
> 

This particular error is interesting:

mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0
mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0

It means that the chip terminated the command for some reason.  I have been
talking to LSI about it.  I'm working on getting an analyzer trace when it
happens, so I can send that to LSI.
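
For anyone who wants to collect these messages from a dmesg, here is a rough
Python sketch that splits one "terminated" line into its fields.  The regular
expression and the field names are my own illustration, not anything the
driver exports.  The two mask constants are assumptions based on my reading
of the MPI 2.0 headers (low 15 bits are the IOCStatus code, the top bit flags
that log info is available); check them against mpi2.h before relying on
them.

```python
import re

# Assumed layout of the driver message; the group names are illustrative only.
LINE_RE = re.compile(
    r"^(?P<dev>\w+): \((?P<triple>\d+:\d+:\d+)\) terminated "
    r"ioc (?P<ioc>[0-9a-fA-F]+) scsi (?P<scsi>[0-9a-fA-F]+) "
    r"state (?P<state>[0-9a-fA-F]+) xfer (?P<xfer>\d+)$"
)

# Assumptions taken from the MPI 2.0 headers, not from this thread.
IOCSTATUS_MASK = 0x7FFF   # low 15 bits: the IOCStatus code itself
LOG_INFO_FLAG = 0x8000    # top bit: log info available


def parse_terminated(line):
    """Return the fields of one 'terminated' message, or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ioc = int(m.group("ioc"), 16)
    return {
        "device": m.group("dev"),
        "address": m.group("triple"),            # printed as (x:y:z)
        "ioc_raw": ioc,
        "ioc_status": ioc & IOCSTATUS_MASK,
        "log_info_available": bool(ioc & LOG_INFO_FLAG),
        "scsi_status": int(m.group("scsi"), 16),
        "scsi_state": int(m.group("state"), 16),
        "xfer_count": int(m.group("xfer")),
    }


if __name__ == "__main__":
    sample = "mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0"
    print(parse_terminated(sample))
```

Under those assumptions, 804b splits into status code 0x4b with the log info
flag set, and the transfer count of 0 says no data moved before the chip gave
up on the command.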

What kind of expander do you have in your system?  How many expanders do
you have?  How many drives do you have?  Can you send 'camcontrol devlist
-v' output?

> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 1
> hw.mps.0.io_cmds_highwater: 959
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 1721
> hw.mps.0.chain_alloc_fail: 0
> 
> This time I did a recursive copy of a folder with no large files at  
> all (it contained only small documents), from 'storage' to 'storage'.
> 
> However, it recovered, so the copy just continued where it left off --  
> which is a change from previous crashes.

Yes, it looks like we're not running into the out of chain problem.
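
The check behind that can be sketched in a few lines of Python: read the
counters from the sysctl output and see whether an allocation ever failed or
the low water mark hit zero.  The function names are made up for
illustration, and the "peak in use" figure is only an estimate that assumes
the current chain_free value (taken with the controller nearly idle) is the
full size of the pool.

```python
def parse_sysctl(text):
    """Map the last component of each sysctl name to its integer value."""
    values = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()      # tolerate mail quoting
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            values[name.strip().rsplit(".", 1)[-1]] = int(value)
        except ValueError:
            continue                          # not a numeric counter
    return values


def chain_summary(v):
    """Summarize chain frame usage from the parsed counters."""
    return {
        # Out of chain frames if an allocation failed or the pool ran dry.
        "out_of_chain": v["chain_alloc_fail"] > 0
                        or v["chain_free_lowwater"] == 0,
        # Estimate only: assumes chain_free currently equals the pool size.
        "peak_in_use_estimate": v["chain_free"] - v["chain_free_lowwater"],
    }


if __name__ == "__main__":
    import subprocess
    out = subprocess.run(["sysctl", "hw.mps.0"],
                         capture_output=True, text=True).stdout
    print(chain_summary(parse_sysctl(out)))
```

With the numbers quoted above (chain_free 2048, low water 1721, no allocation
failures) that comes out as not out of chain, with roughly 327 frames in use
at the worst point.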

The timeouts could be due to all sorts of problems.  The IOC terminated
errors I'm still not sure about.  I need to get a trace and send that along
with a diagnostic ring buffer dump from the card to LSI to get some answers
about what is going on.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG