Date: Thu, 9 Feb 2012 07:22:40 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mike Tancsa <mike@sentex.net>
Cc: Alexander Motin <mav@FreeBSD.org>, freebsd-stable@FreeBSD.org
Subject: Re: siisch1: Error while READ LOG EXT
Message-ID: <20120209152240.GA95470@icarus.home.lan>
In-Reply-To: <4F33DB75.1080202@sentex.net>
References: <4F32E289.4080806@sentex.net> <mailpost.1328736521.3202974.81071.mailing.freebsd.stable@FreeBSD.cs.nctu.edu.tw> <4F32F5B0.2060203@FreeBSD.org> <20120208223819.GA27488@icarus.home.lan> <4F32FB5E.7050102@FreeBSD.org> <4F33DB75.1080202@sentex.net>

On Thu, Feb 09, 2012 at 09:43:01AM -0500, Mike Tancsa wrote:
> On 2/8/2012 5:46 PM, Alexander Motin wrote:
> >
> > READ LOG EXT for NCQ, same as REQUEST SENSE for ATAPI sent by every
> > specific controller driver. In this case by siis_issue_recovery()
> > function in dev/siis/siis.c. In case of proper READ LOG EXT completion,
> > fetched status returned to CAM together with original command.
> Hi,
> Is there a way to find out which drive is causing these errors ?
> Looking at the logs on the various drives, they all seem to have the odd
> non zero value. I suspect it might be a Segate Disk as smartctl flags
> it as having bad firmware issues
>
>
> === START OF INFORMATION SECTION ===
> Model Family: Seagate Barracuda 7200.11
> Device Model: ST31000333AS
> Serial Number: 9TE14SRV
> LU WWN Device Id: 5 000c50 010a39664
> Firmware Version: SD35
> User Capacity: 1,000,204,886,016 bytes [1.00 TB]
> Sector Size: 512 bytes logical/physical
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 8
> ATA Standard is: ATA-8-ACS revision 4
> Local Time is: Thu Feb 9 09:40:56 2012 EST
>
> ==> WARNING: There are known problems with these drives,
> see the following Seagate web pages:
> http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
> http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951
> http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207957

The URLs listed are for firmware-level problems with this model of
Seagate drive. This is a very famous firmware issue and got a lot of
media attention. The bugs with that firmware, however, would not appear
as what you are seeing.

You stated in your original mail that you "added a port multiplier" then
started getting these errors. You then provided SMART output of
/dev/ada9, so I made the assumption you had managed to figure out what
device was causing the problem.

I have to assume that devices connected on a port multiplier show up on
a separate scbusX number. This is from your original mail:

> # camcontrol devlist
> <WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 0 lun 0 (pass0,ada0)
> <WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 1 lun 0 (pass1,ada1)
> <WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 2 lun 0 (pass2,ada2)
> <WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 3 lun 0 (pass3,ada3)
> <Port Multiplier 47261095 1f06> at scbus0 target 15 lun 0 (pass4,pmp1)
> <WDC WD2002FAEX-007BA0 05.01D05> at scbus1 target 0 lun 0 (pass5,ada4)
> <WDC WD2002FAEX-007BA0 05.01D05> at scbus1 target 1 lun 0 (pass6,ada5)
> <WDC WD2002FAEX-007BA0 05.01D05> at scbus1 target 2 lun 0 (pass7,ada6)
> <WDC WD2002FAEX-007BA0 05.01D05> at scbus1 target 3 lun 0 (pass8,ada7)
> <WDC WD2002FAEX-007BA0 05.01D05> at scbus1 target 4 lun 0 (pass9,ada8)
> <Port Multiplier 37261095 1706> at scbus1 target 15 lun 0 (pass10,pmp0)
> <Areca usrvar R001> at scbus4 target 0 lun 0 (pass11,da0)
> <Areca backup1 R001> at scbus4 target 0 lun 1 (pass12,da1)
> <Areca RAID controller R001> at scbus4 target 16 lun 0 (pass13)
> <AMCC 9650SE-2LP DISK 4.10> at scbus5 target 0 lun 0 (pass14,da2)
> <ST31000333AS SD35> at scbus6 target 0 lun 0 (pass15,ada9)
> <ST31000528AS CC35> at scbus7 target 0 lun 0 (pass16,ada10)
> <ST31000340AS SD1A> at scbus8 target 0 lun 0 (pass17,ada11)
> <WDC WD1002FAEX-00Z3A0 05.01D05> at scbus11 target 0 lun 0 (pass18,ada12)

Based on this, and assuming I understand how this setup works (please
note I could be wrong; I have no personal familiarity with port
multipliers), it looks to me like this:

scbus0  --> Associated with Port Multiplier device pmp1
        --> Disk ada0
        --> Disk ada1
        --> Disk ada2
        --> Disk ada3
scbus1  --> Associated with Port Multiplier device pmp0
        --> Disk ada4
        --> Disk ada5
        --> Disk ada6
        --> Disk ada7
        --> Disk ada8
scbus4  --> Appears to be an Areca controller of some kind, in RAID
        --> Disk da0, volume "usrvar"
        --> Disk da1, volume "backup1"
scbus5  --> Not sure what this thing is
        --> Disk or "thing" da2
scbus6  --> Disk ada9
scbus7  --> Disk ada10
scbus8  --> Disk ada11
scbus11 --> Disk ada12

So which Port Multiplier did you add? The one at scbus0 or scbus1?

A full dmesg (not just a snippet) would probably be helpful here. What
you provided in your first post was too terse, especially given how many
disks you have in this system. :-)

I really see no problem with looking at all the disks, specifically
ada0 through ada3 and ada4 through ada8, to determine which one may be
having problems. You're welcome to run "smartctl -a" on each one and put
the output up on the web, preferably segregated by disk name (e.g.
ada0.txt, ada1.txt, etc.), and I can review them all.

-- 
| Jeremy Chadwick                                  jdc@parodius.com |
| Parodius Networking                      http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, US |
| Making life hard for others since 1977.              PGP 4BD6C0CB |
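
The grouping by scbus done by hand above can also be scripted. What follows is only a minimal sketch, assuming the "camcontrol devlist" line format shown in the quoted output and assuming (as the message does) that disks sharing a bus number with a pmpN device sit behind that port multiplier. The function names and the SAMPLE text are hypothetical, chosen for illustration. The sketch only builds the "smartctl -a" command lines, one output file per disk; it does not run them.

```python
import re
from collections import defaultdict

# Matches one `camcontrol devlist` line, with or without a leading "> " quote.
DEVLIST_RE = re.compile(
    r"<(?P<ident>[^>]*)>\s+at\s+scbus(?P<bus>\d+)\s+"
    r"target\s+(?P<target>\d+)\s+lun\s+(?P<lun>\d+)\s+"
    r"\((?P<periphs>[^)]*)\)"
)

# Hypothetical sample: a few lines copied from the listing in this message.
SAMPLE = """\
<WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD2001FASS-00U0B0 01.00101> at scbus0 target 1 lun 0 (pass1,ada1)
<Port Multiplier 47261095 1f06> at scbus0 target 15 lun 0 (pass4,pmp1)
<Areca RAID controller R001> at scbus4 target 16 lun 0 (pass13)
<ST31000333AS SD35> at scbus6 target 0 lun 0 (pass15,ada9)
"""


def group_by_bus(devlist_text):
    """Map each scbus number to its peripheral names, ignoring passN."""
    buses = defaultdict(list)
    for line in devlist_text.splitlines():
        match = DEVLIST_RE.search(line)
        if match is None:
            continue  # not a device line
        names = [p for p in match.group("periphs").split(",")
                 if not p.startswith("pass")]
        buses[int(match.group("bus"))].extend(names)
    return dict(buses)


def disks_behind_multipliers(buses):
    """Return adaN disks on any bus that also lists a pmpN device."""
    disks = []
    for bus in sorted(buses):
        names = buses[bus]
        if any(n.startswith("pmp") for n in names):
            disks.extend(n for n in names if n.startswith("ada"))
    return disks


def smartctl_commands(disks):
    """Build one `smartctl -a` command per disk, saved as <disk>.txt."""
    return ["smartctl -a /dev/%s > %s.txt" % (d, d) for d in disks]


if __name__ == "__main__":
    for cmd in smartctl_commands(disks_behind_multipliers(group_by_bus(SAMPLE))):
        print(cmd)
```

Fed the full devlist output instead of SAMPLE, this would list ada0 through ada8, the disks on the two buses that carry a port multiplier, which are the ones suggested for review.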