From owner-freebsd-fs@FreeBSD.ORG  Wed Apr 13 08:59:40 2011
Date: Wed, 13 Apr 2011 01:59:38 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mailing Lists <mailing.lists@streamvia.net>
Message-ID: <20110413085938.GA51187@icarus.home.lan>
References: <202b7a73-bc0f-4c58-9c2a-8c9c42e66394@zcs01>
	<fdef76ed-89ad-417b-a67f-1f4d5e393e8e@zcs01>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <fdef76ed-89ad-417b-a67f-1f4d5e393e8e@zcs01>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: siisch0: Timeout on slot 30

On Wed, Apr 13, 2011 at 08:33:48AM +0100, Mailing Lists wrote:
> Morning All, 
> 
> 
> I see there have been a few threads about something similar, specifically port multiplier timeouts with the siis module, and a patch was even provided at one point. However, that patch no longer appears to be valid: judging from the CVS logs, 2 of the 3 chunks are already in the 8.2 release. When I try to apply it, declining the rollback, 1 chunk does apply. I think the patch makes some difference, in that the disks no longer stall for as long or as frequently, but the errors still appear in the logs.
> 
> Currently running FreeBSD 8.2-RELEASE with the ZFS v28 patch. The main chassis contains a Silicon Image 3124 with an eSATA port; connected to that eSATA port is another disk shelf with 5 disks, all in one raidz pool. When I run a scrub on the external eSATA shelf, I see timeouts in my logs:
> 
> 
> 
> siisch0: Error while READ LOG EXT 
> siisch0: Error while READ LOG EXT 
> siisch0: Timeout on slot 30 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 04440000 
> siisch0: Timeout on slot 26 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 00440000 
> siisch0: Timeout on slot 22 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 00040000 
> siisch0: Timeout on slot 18 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: Error while READ LOG EXT 
> siisch0: Error while READ LOG EXT 
> 
> 
> siisch0 is the port multiplier, and the slot number seems crazy to me (I don't have 30 slots; there are only 5 in the external shelf), although I'm not sure whether it references something else. I don't believe this is a hardware problem: I've tried replacement devices and controller cards and still see the above.
> 
> Anyone else seeing this or have any thoughts? 

I cannot help you with your problem, but I can help reduce your
confusion regarding "slot numbers seeming crazy".

Based on an analysis of the src/sys/dev/siis code, there is no 1:1
relation between a slot number and a disk/port/drive-bay/device.  "Slot"
in this context means something completely different; do not correlate
the two things[1].  I don't know what "slot" means in this context; mav@
will know for certain.

Take a look at src/sys/dev/siis/siis.h, specifically the "struct
siis_channel" structure.  You'll see there are multiple "slots"
(declared as struct siis_slot) per channel.  Each slot appears to have
its own identification number -- in the kernel printf(), it's referred
to as slot->slot.  Thus: channel->slot[X].slot

Your controller has multiple channels, and each channel has its own set
of slots.  The slot number appears to be stored in an 8-bit field, which
is where an upper bound of 256 (0-255) would come from, but the "ss",
"rs" and "waiting for slots" values in your log are 32-bit masks whose
set bits (18, 22, 26, 30) match the slot numbers reported, so in
practice expect slot numbers between 0 and 31.  Each slot has a separate
DMA channel associated with it, as well as a separate CCB, and a
separate timeout value.

A channel on SATA usually correlates (1:1) to a disk, but I don't know
how port multipliers fit into the picture (I do on a physical level,
just not on a software level).  Controller channels can also have their
own DMA channel (see [1] again).

This is pretty cool from a performance perspective; I'd never looked at
siis until now.

Anyway, point being: slot != SATA port.  Hopefully that relieves your
concern there.
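
If you want to check this against your own log, here is a rough sketch
in Python.  It assumes the "rs" and "waiting for slots" values are plain
bitmasks with one bit per slot; that is my reading of the output, not
something confirmed by mav@.  The helper name slots_in_mask is made up
for illustration and does not exist in the driver.

```python
def slots_in_mask(mask):
    """Return the slot numbers whose bits are set in a 32-bit slot mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]

# Values copied from the quoted log lines.
running = 0x44440000   # "rs 44440000"
waiting = 0x04440000   # "... waiting for slots 04440000"

print(slots_in_mask(running))  # [18, 22, 26, 30]
print(slots_in_mask(waiting))  # [18, 22, 26]
```

The decoded numbers line up with the "Timeout on slot 30/26/22/18"
messages, which fits the idea that a slot is a per-channel command
entry rather than a drive bay.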


[1]: Anyone working with technology needs to accept the fact that there
are too many words in the English language that are synonyms for
"thing".  Common tech terms for such: index, slot, port, channel, bay,
tag, bus, volume, and even LUN (yes, as in SCSI LUN).  I'm
forgetting some commonly-used others.

When these terms are used with someone who lacks context of what the
term actually refers to (on a technical/software/hardware level),
confusion is guaranteed.  Am I recommending the printf()s be changed?
Absolutely not.  Just know that slot != SATA port.

Regarding LUN: I still see people correlating (1:1) a LUN with a disk.
When you introduce these people to SANs, where a LUN often consists of
multiple devices (commonly disks), confusion is guaranteed.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |