From owner-freebsd-fs@FreeBSD.ORG Wed Apr 13 08:59:40 2011
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 093D71065670
	for ; Wed, 13 Apr 2011 08:59:40 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta03.emeryville.ca.mail.comcast.net
	(qmta03.emeryville.ca.mail.comcast.net [76.96.30.32])
	by mx1.freebsd.org (Postfix) with ESMTP id E4BC68FC14
	for ; Wed, 13 Apr 2011 08:59:39 +0000 (UTC)
Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87])
	by qmta03.emeryville.ca.mail.comcast.net with comcast
	id WwwS1g0021smiN4A3wzfJH; Wed, 13 Apr 2011 08:59:39 +0000
Received: from koitsu.dyndns.org ([67.180.84.87])
	by omta20.emeryville.ca.mail.comcast.net with comcast
	id Wwze1g00A1t3BNj8gwzepk; Wed, 13 Apr 2011 08:59:39 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 496AF9B42B; Wed, 13 Apr 2011 01:59:38 -0700 (PDT)
Date: Wed, 13 Apr 2011 01:59:38 -0700
From: Jeremy Chadwick
To: Mailing Lists
Message-ID: <20110413085938.GA51187@icarus.home.lan>
References: <202b7a73-bc0f-4c58-9c2a-8c9c42e66394@zcs01>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: siisch0: Timeout on slot 30
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 13 Apr 2011 08:59:40 -0000

On Wed, Apr 13, 2011 at 08:33:48AM +0100, Mailing Lists wrote:
> Morning All,
>
>
> I see there has been a few threads relating to something similar,
> specifically around port multiplier time outs using the SIIS module,
> with a patch even provided at one point - however this patch doesnt
> appear to be valid any more, 2 of the 3 chunks are already in the 8.2
> release and from checking the cvs logs; when i try to apply, avoiding
> the roll back, 1 chunk does apply. I think the patch does make some
> difference, in that it doesnt make the disks stall for as long as
> frequently but the errors still appear in the logs.
>
> Currently running FreeBSD 8.2-RELEASE, with the zfs v28 patch. The
> main chassis has a Silicon Image 3124 in, with an estata port -
> connected to the esata port is another disk shelf with 5 disks in, all
> in one raidz pool. When i run a scrub on the external esata shelf, i
> see time outs in my logs:
>
>
>
> siisch0: Error while READ LOG EXT
> siisch0: Error while READ LOG EXT
> siisch0: Timeout on slot 30
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000
> siisch0: ... waiting for slots 04440000
> siisch0: Timeout on slot 26
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000
> siisch0: ... waiting for slots 00440000
> siisch0: Timeout on slot 22
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000
> siisch0: ... waiting for slots 00040000
> siisch0: Timeout on slot 18
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000
> siisch0: Error while READ LOG EXT
> siisch0: Error while READ LOG EXT
>
>
> siisch0 is the port multiplier , the slot number seems crazy to me (as
> i dont have 30 slots, theres only 5 in the external shelf) - although
> im not sure if this references something else. I dont believe this is
> a hardware problem, ive tried with replacement devices/controller
> cards, still see the above.
>
> Anyone else seeing this or have any thoughts?

I cannot help you with your problem, but I can help reduce your
confusion regarding "slot numbers seeming crazy".

Based on an analysis of the src/sys/dev/siis code, there is no 1:1
relation between a slot number and a disk/port/drive-bay/device.
"Slot" in this context means something completely different; do not
correlate the two things[1].  I don't know exactly what "slot" means
here; mav@ will know for certain.

Take a look at src/sys/dev/siis/siis.h, specifically the "struct
siis_channel" structure.  You'll see there are multiple "slots"
(declared as struct siis_slot) per channel.  Each slot appears to have
its own identification number -- in the kernel printf(), it's referred
to as slot->slot.  Thus:

channel->slot[X]->slot

Your controller has multiple channels -- and each channel can have up to
256 (0-255) slots.  Each slot has a separate DMA channel associated with
it, as well as a separate CCB, and a separate timeout value.

A channel on SATA usually correlates (1:1) to a disk, but I don't know
how port multipliers fit into the picture (I do on a physical level,
just not on a software level).  Controller channels can also have their
own DMA channel (see [1] again).

This is pretty cool from a performance perspective; I'd never looked at
siis until now.

Anyway, point being: slot != SATA port.  Hopefully that relieves your
concern there.


[1]: Anyone working with technology needs to accept the fact that there
are too many words in the English language that are synonyms for
"thing".  Common tech terms for such: index, slot, port, channel, bay,
tag, bus, volume, and even LUN (yes, as in SCSI LUN).  I'm forgetting
some other commonly-used ones.  When these terms are used with someone
who lacks context for what the term actually refers to (on a
technical/software/hardware level), confusion is guaranteed.  Am I
recommending the printf()s be changed?  Absolutely not.  Just know that
slot != SATA port.

Regarding LUN: I still see people correlating (1:1) a LUN with a disk.
When you introduce these people to SANs, where a LUN often consists of
multiple devices (commonly disks), confusion is guaranteed.
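
For illustration only, here is a minimal sketch in C of the
channel/slot layout described above, plus a small decoder for the hex
values in the quoted log.  Everything named sketch_* is hypothetical and
is not the driver's code; the real definitions live in
src/sys/dev/siis/siis.h and carry many more fields.  Two assumptions,
neither confirmed against the driver: that the "ss", "rs" and "waiting
for slots" values are bitmasks with one bit per slot (the log pattern is
consistent with this), and that 32 slots is a convenient size for the
model because those masks are 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one bit per slot in the 32-bit masks seen in the log.
 * The real per-channel slot limit is whatever siis.h defines. */
#define SKETCH_MAX_SLOTS 32

/* Hypothetical, heavily simplified stand-in for struct siis_slot. */
struct sketch_slot {
	uint8_t slot;		/* number of this slot, as printed by the kernel */
};

/* Hypothetical, heavily simplified stand-in for struct siis_channel. */
struct sketch_channel {
	struct sketch_slot slot[SKETCH_MAX_SLOTS];
};

/* Give every slot its own index, so that ch->slot[X].slot == X. */
void
sketch_channel_init(struct sketch_channel *ch)
{
	assert(ch != NULL);
	for (int i = 0; i < SKETCH_MAX_SLOTS; i++)
		ch->slot[i].slot = (uint8_t)i;
}

/*
 * Write the numbers of the bits set in mask into out[], highest bit
 * first, and return how many were found.
 */
int
sketch_decode_mask(uint32_t mask, uint8_t out[SKETCH_MAX_SLOTS])
{
	int n = 0;

	assert(out != NULL);
	for (int i = SKETCH_MAX_SLOTS - 1; i >= 0; i--) {
		if (mask & (UINT32_C(1) << i))
			out[n++] = (uint8_t)i;
	}
	return n;
}
```

Under those assumptions, decoding 0x44440000 yields 30, 26, 22 and 18,
the same four slot numbers that time out in the log, and 0x04440000
yields 26, 22 and 18, matching the "waiting for slots" line printed
after slot 30 times out.  That fits the reading that a slot number
identifies an outstanding command on the channel rather than a drive
bay.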
-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |