Date:      Thu, 09 Feb 2012 11:12:06 -0500
From:      Mike Tancsa <mike@sentex.net>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        Alexander Motin <mav@FreeBSD.org>, freebsd-stable@FreeBSD.org
Subject:   Re: siisch1: Error while READ LOG EXT
Message-ID:  <4F33F056.6070300@sentex.net>
In-Reply-To: <20120209152240.GA95470@icarus.home.lan>
References:  <4F32E289.4080806@sentex.net> <mailpost.1328736521.3202974.81071.mailing.freebsd.stable@FreeBSD.cs.nctu.edu.tw> <4F32F5B0.2060203@FreeBSD.org> <20120208223819.GA27488@icarus.home.lan> <4F32FB5E.7050102@FreeBSD.org> <4F33DB75.1080202@sentex.net> <20120209152240.GA95470@icarus.home.lan>

On 2/9/2012 10:22 AM, Jeremy Chadwick wrote:
> 
> I have to assume that devices connected on a port multiplier show up on
> a separate scbusX number.  This is from your original mail:

> Based on this, and assuming my understanding of how this setup works --
> and please note I could be wrong, these port multiplier things I have no
> familiarity with personally -- but it looks (to me) like this:
> 
> scbus0
>   --> Associated with Port Multiplier device pmp1
>       --> Disk ada0
>       --> Disk ada1
>       --> Disk ada2
>       --> Disk ada3

Correct. This is the original hardware.  It too was showing the
occasional error before I added the new set of disks to expand the zfs
pool.  For example, here are some errors on the original PM:

Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 24
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 24002a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 13
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 24000a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 29
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 04000a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 11
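
The hex values in these messages look like per-slot bitmasks: each
"Timeout on slot N" is followed by a "waiting for slots" value with bit N
cleared (25002a00 -> 24002a00 after slot 24, -> 24000a00 after slot 13,
-> 04000a00 after slot 29).  Assuming bit N corresponds to command slot N,
a small sh sketch to list the slots still outstanding might look like
this.  The function name decode_slots is made up for illustration; it is
not part of any FreeBSD tool.

```shell
#!/bin/sh
# Sketch: print the bit positions set in a 32-bit hex mask.
# Assumption: bit N of the mask stands for command slot N.
decode_slots() {
    mask=$((0x$1))          # e.g. 24002a00 -> integer
    slot=0
    out=""
    while [ "$slot" -lt 32 ]; do
        if [ $(( (mask >> slot) & 1 )) -eq 1 ]; then
            out="${out:+$out }$slot"   # append, space separated
        fi
        slot=$((slot + 1))
    done
    echo "$out"
}

# Mask from the first "waiting for slots" line above
decode_slots 24002a00
```

Read this way, the first "waiting for slots" line leaves slots 9, 11, 13,
26 and 29 pending, which matches the later timeouts on slots 13, 29 and
11.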


> 
> scbus1
>   --> Associated with Port Multiplier device pmp0
>       --> Disk ada4
>       --> Disk ada5
>       --> Disk ada6
>       --> Disk ada7
>       --> Disk ada8

Correct, this is the new PM. 4 disks in use, and one spare.

> 
> scbus4
>   --> Appears to be an Areca controller of some kind, in RAID

yes.

>       --> Disk da0, volume "usrvar" 
>       --> Disk da1, volume "backup1"
> 
> scbus5
>   --> Not sure what this thing is

3ware with a pair of faster disks that holds a large DB to slice and
dice netflow data.

>       --> Disk or "thing" da2
> 
> scbus6
> scbus7
> scbus8
> scbus11
>   --> Disk ada12

Disks off the motherboard.

> 
> So which Port Multiplier did you add?  The one at scbus0 or scbus1?

1
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 0 lun 0 (pass5,ada4)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 1 lun 0 (pass6,ada5)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 2 lun 0 (pass7,ada6)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 3 lun 0 (pass8,ada7)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 4 lun 0 (pass9,ada8)
<Port Multiplier 37261095 1706>    at scbus1 target 15 lun 0 (pass10,pmp0)
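
To get the bus-to-disk mapping without reading the listing by eye, lines
in this format can be reduced to "bus target: devices" with sed.  This is
only a sketch: it assumes the listing comes from "camcontrol devlist" and
that every line has the "at scbusN target N lun N (names)" shape shown
above.  The function name bus_map is hypothetical.

```shell
#!/bin/sh
# Sketch: reduce device-list lines read on stdin to "scbusN target N: names".
# Assumption: input lines look like
#   <model rev>   at scbus1 target 0 lun 0 (pass5,ada4)
bus_map() {
    sed -n 's/.* at \(scbus[0-9]*\) target \([0-9]*\) .*(\(.*\))[[:space:]]*$/\1 target \2: \3/p'
}

# Usage (on the machine itself):
#   camcontrol devlist | bus_map
```

For the first line of the listing above, this would print
"scbus1 target 0: pass5,ada4".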





> 
> A full dmesg (not just a snippet) would probably be helpful here.  What
> you provided in your first post was too terse, especially given how many
> disks you have in this system.  :-)
> 
> I really see no problem with looking at all disks -- specifically disks
> ada0 through ada3, and ada4 through ada8 -- to determine which one may
> be having problems.  You're welcome to run "smartctl -a" on each one and
> put them up on the web, preferably segregated by disk name (e.g.
> ada0.txt, ada1.txt, etc.) and I can review them all.

Actually, I just had a look at another server at our DR site. Its
hardware has not changed in a while, but I did bring the kernel up to
date.  It's now logging the occasional 'READ LOG EXT' error as well.  Its
kernel is from Jan 22.  Prior to that kernel update, I had not seen these
errors.  Perhaps something has changed in the driver (ahci or cam layer?)

Feb  4 11:12:36 offsite kernel: siisch1: Error while READ LOG EXT

The output is in one giant txt file, but each section is headed by the
name of the disk.  I generated it with:

    for i in `jot 10 0`; do
        echo "==================== ada$i ==================" >> d.rep
        smartctl -x /dev/ada$i >> d.rep
        smartctl -l gplog,0x10 /dev/ada$i >> d.rep
    done



http://www.tancsa.com/ahci.txt
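
If files segregated by disk name (ada0.txt, ada1.txt, etc.) are easier to
review, the same loop can write one file per disk instead.  A minimal
sketch, assuming the disks are numbered consecutively from ada0; the
function name collect_reports and the SMARTCTL override variable are made
up for illustration, and only the smartctl options already used in the
loop above appear here.

```shell
#!/bin/sh
# Sketch: write one SMART report per disk (ada0.txt, ada1.txt, ...).
# SMARTCTL can be overridden for a dry run, e.g. SMARTCTL=echo.
SMARTCTL=${SMARTCTL:-smartctl}

collect_reports() {
    count=$1      # number of disks, starting at ada0
    outdir=$2     # existing directory for the per-disk files
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            echo "==================== ada$i =================="
            "$SMARTCTL" -x "/dev/ada$i"
            "$SMARTCTL" -l gplog,0x10 "/dev/ada$i"
        } > "$outdir/ada$i.txt" 2>&1
        i=$((i + 1))
    done
}

# Usage: mkdir -p /tmp/smart && collect_reports 10 /tmp/smart
```

The while loop stands in for `jot 10 0` so the sketch also runs on
systems without jot.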


	---Mike







-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


