Date: Thu, 09 Feb 2012 11:12:06 -0500
From: Mike Tancsa <mike@sentex.net>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc: Alexander Motin <mav@FreeBSD.org>, freebsd-stable@FreeBSD.org
Subject: Re: siisch1: Error while READ LOG EXT
Message-ID: <4F33F056.6070300@sentex.net>
In-Reply-To: <20120209152240.GA95470@icarus.home.lan>
References: <4F32E289.4080806@sentex.net> <mailpost.1328736521.3202974.81071.mailing.freebsd.stable@FreeBSD.cs.nctu.edu.tw> <4F32F5B0.2060203@FreeBSD.org> <20120208223819.GA27488@icarus.home.lan> <4F32FB5E.7050102@FreeBSD.org> <4F33DB75.1080202@sentex.net> <20120209152240.GA95470@icarus.home.lan>

On 2/9/2012 10:22 AM, Jeremy Chadwick wrote:
>
> I have to assume that devices connected on a port multiplier show up on
> a separate scbusX number.  This is from your original mail:
>
> Based on this, and assuming my understanding of how this setup works --
> and please note I could be wrong, these port multiplier things I have no
> familiarity with personally -- but it looks (to me) like this:
>
> scbus0
>   --> Associated with Port Multiplier device pmp1
>   --> Disk ada0
>   --> Disk ada1
>   --> Disk ada2
>   --> Disk ada3

Correct. This is the original hardware. It too was showing the odd error
prior to adding the new set of disks to expand the zfs pool.

e.g. here are some errors on the original PM

Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 24
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 24002a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 13
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 24000a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 29
Feb  4 22:55:02 backup3 kernel: siisch0: siis_timeout is 00040000 ss 25002a00 rs 25002a00 es 00000000 sts 80182000 serr 00000000
Feb  4 22:55:02 backup3 kernel: siisch0:  ... waiting for slots 04000a00
Feb  4 22:55:02 backup3 kernel: siisch0: Timeout on slot 11

>
> scbus1
>   --> Associated with Port Multiplier device pmp0
>   --> Disk ada4
>   --> Disk ada5
>   --> Disk ada6
>   --> Disk ada7
>   --> Disk ada8

Correct, this is the new PM. 4 disks in use, and one spare.

>
> scbus4
>   --> Appears to be an Areca controller of some kind, in RAID

yes.

>   --> Disk da0, volume "usrvar"
>   --> Disk da1, volume "backup1"
>
> scbus5
>   --> Not sure what this thing is

3ware with a pair of faster disks that holds a large DB to slice and
dice netflow data.

>   --> Disk or "thing" da2
>
> scbus6
> scbus7
> scbus8
> scbus11
>   --> Disk ada12

Disks off the motherboard.

>
> So which Port Multiplier did you add?  The one at scbus0 or scbus1?

1

<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 0 lun 0 (pass5,ada4)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 1 lun 0 (pass6,ada5)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 2 lun 0 (pass7,ada6)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 3 lun 0 (pass8,ada7)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 4 lun 0 (pass9,ada8)
<Port Multiplier 37261095 1706>    at scbus1 target 15 lun 0 (pass10,pmp0)

>
> A full dmesg (not just a snippet) would probably be helpful here.  What
> you provided in your first post was too terse, especially given how many
> disks you have in this system.  :-)
>
> I really see no problem with looking at all disks -- specifically disks
> ada0 through ada3, and ada4 through ada8 -- to determine which one may
> be having problems.  You're welcome to run "smartctl -a" on each one and
> put them up on the web, preferably segregated by disk name (e.g.
> ada0.txt, ada1.txt, etc.) and I can review them all.

Actually, I just had a look at another server at our DR site. Its
hardware has not changed in a bit, but I did bring the kernel up to date.
It's now logging the odd 'READ LOG EXT' error as well. Its kernel is from
Jan 22. Prior to that kernel update, I had not seen these errors.
Something in the driver (ahci or cam layer?) that has changed perhaps?

Feb  4 11:12:36 offsite kernel: siisch1: Error while READ LOG EXT

The output is in one giant txt file. But each section has the heading of
the disk

(for i in `jot 10 0`;do echo "==================== ada$i ==================" >> d.rep; smartctl -x /dev/ada$i >>d.rep;smartctl -l gplog,0x10 /dev/ada$i >> d.rep;done;)

http://www.tancsa.com/ahci.txt

	---Mike

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/
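
For reference, the collection loop in the message could also write one
report per disk (ada0.txt, ada1.txt, ...), as requested in the quoted
text. The following is only a sketch under assumptions: the function name
collect_smart, the SMARTCTL override variable and the output directory
are hypothetical and do not appear in the original mail; the smartctl
options are the same ones used in the loop there.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): write one SMART
# report per disk instead of appending everything to a single file.
# SMARTCTL may be overridden, e.g. with a stub command for a dry run.
SMARTCTL="${SMARTCTL:-smartctl}"

collect_smart() {
    outdir="$1"
    shift
    mkdir -p "$outdir" || return 1
    for disk in "$@"; do
        {
            # Same heading and smartctl options as the loop in the mail.
            echo "==================== $disk =================="
            "$SMARTCTL" -x "/dev/$disk"
            "$SMARTCTL" -l gplog,0x10 "/dev/$disk"
        } > "$outdir/$disk.txt" 2>&1
    done
}

# Example (assumed disk names):
#   collect_smart ./smart ada0 ada1 ada2 ada3
```

Each file then carries a single disk's heading and output, so a problem
disk can be reviewed without searching through one large report.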