Date: Tue, 08 Dec 2015 08:26:24 -0500
From: Michael Jung <mikej@mikej.com>
To: prateek sethi <prateekrootkey@gmail.com>
Cc: Scott Long <scott4long@yahoo.com>, freebsd-scsi@freebsd.org, owner-freebsd-scsi@freebsd.org
Subject: Re: bad disk discovery
Message-ID: <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>
In-Reply-To: <CABD8d0oqXoJFEcKzY3cj%2BVv2X%2B5Mg0ceQ%2Bivn3NP%2BHdMdDArRg@mail.gmail.com>
References: <CABD8d0rTD2Fu9QsqLKREBcA-nndQrUH8F8DWrPh_KQB64qjy1Q@mail.gmail.com> <6A7832F8-53EB-4641-8EF6-E0E6175EB52D@yahoo.com> <CABD8d0oqXoJFEcKzY3cj%2BVv2X%2B5Mg0ceQ%2Bivn3NP%2BHdMdDArRg@mail.gmail.com>

On 2015-12-08 08:00, prateek sethi wrote:
> Hi Scott,
> Thanks for your quick response.
>
> I have a different set of hardware, so I want to know how I can debug
> it myself. Is there any way or procedure I can use to find out the
> reason for CDB errors or disk command failures?
>
> Right now I am giving details about the setup where I am getting this
> issue.
>
> I am using an LSI SAS2008 controller connected to a Supermicro
> enclosure with FreeBSD 9.3. There are 16 different disks, but only one
> disk is having a problem. That means the controller and cable are fine.
>
> The faulty disk info is as follows:
>
> *smartctl output is:-*
>
> smartctl -x /dev/da23
>
> === START OF INFORMATION SECTION ===
> Vendor: SEAGATE
> Product: ST3600057SS
> Revision: 000B
> Rotation Rate: 15000 rpm
> Form Factor: 3.5 inches
> Logical Unit id: 0x5000c5007725173f
> Serial number: 6SL8YLPC0000N5030DY7
> Device type: disk
> Transport protocol: SAS
> Local Time is: Tue Dec 8 18:20:45 2015 IST
> *device is NOT READY (e.g. spun down, busy)*
>
> *Logs:-*
>
> Dec 8 14:12:01 N1 kernel: da23 at mps0 bus 0 scbus0 target 148 lun 0
> Dec 8 14:12:01 N1 kernel: da23: <SEAGATE ST3600057SS 000B> Fixed Direct Access SCSI-5 device
> Dec 8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
> Dec 8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
> Dec 8 14:12:01 N1 kernel: da23: Command Queueing enabled
> Dec 8 14:12:01 N1 kernel: da23: *Attempt to query device size failed: NOT READY, Logical unit not ready, cause n*
> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: SAS Device Slot Element: 1 Phys at Slot 23
>
> *driver versions:-*
>
> dev.mps.0.firmware_version: 15.00.00.00
> dev.mps.0.driver_version: 16.00.00.00-fbsd
>
> On Tue, Dec 8, 2015 at 3:15 AM, Scott Long <scott4long@yahoo.com> wrote:
>
>> Hi,
>>
>> If your situation is accurate and the disk is not responding properly
>> to regular commands then it’s unlikely that it will respond to SMART
>> commands either. Sometimes these situations are caused by a bad cable,
>> bad controller, or buggy software/firmware, and only rarely will the
>> standard statistics in SMART pick up these kinds of errors. SMART is
>> better at tracking wear rates and error rates on the physical media,
>> both HDD and SSD, but even then it’s hard for it to be accurately
>> predictive or even accurately diagnostic. For your case, I recommend
>> that you describe your hardware and software configuration in more
>> detail, and look for physical abnormalities in the cabling and
>> connections. Once that is ruled out and the rest of us know what kind
>> of hardware you’re dealing with, we might be able to make better
>> recommendations.
>>
>> Scott
>>
>> > On Dec 7, 2015, at 11:07 AM, prateek sethi <prateekrootkey@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > Is there any way or tool to find out whether a disk that is not
>> > responding properly is really bad or not? Sometimes I have seen a lot
>> > of CDB errors for a drive, and a system reboot makes everything fine.
>> > What can be the reasons for such scenarios?
>> >
>> > I know smartctl is the one which can help. I have a couple of
>> > questions regarding this.
>> >
>> > 1. What if the disk does not support smartctl?
>> > 2. How can I make the smartest use of the smartctl command, i.e. which
>> >    parameters can tell that the disk is actually bad?
>> > 3. What other tests can I perform to make sure that the disk has
>> >    completely gone?
>> >
>> > Please tell me the correct place to ask this question if I am asking
>> > in the wrong place.
>> > _______________________________________________
>> > freebsd-scsi@freebsd.org mailing list
>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

Have you simply moved the drive to another slot to see whether the
problem follows the drive? It is unlikely, but it could be a backplane
issue.

I don't know about version 15 firmware; I have always used version 16
firmware with 9.x to match the driver version.
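
To make the "does the problem follow the drive?" check less error-prone,
one option is to summarize the kernel log by serial number and enclosure
slot before and after moving the disk. The following is only a rough
sketch: the regular expressions are assumptions derived from the log
lines quoted in this thread (FreeBSD 9.3, mps driver) and may need
adjusting for other releases, and the function name `summarize` is
purely illustrative.

```python
import re
from collections import defaultdict

# Patterns are assumptions based on the log lines quoted in this thread;
# other FreeBSD releases or drivers may format these messages differently.
SERIAL_RE = re.compile(r"kernel: (da\d+): Serial Number (\S+)")
SLOT_RE = re.compile(
    r"kernel: ses\d+: (da\d+),pass\d+: Element descriptor: '([^']+)'"
)
FAIL_RE = re.compile(r"kernel: (da\d+): .*Attempt to query device size failed")


def summarize(log_text):
    """Map each daN device to its serial, slot, and size-query failure flag."""
    devices = defaultdict(
        lambda: {"serial": None, "slot": None, "size_query_failed": False}
    )
    for line in log_text.splitlines():
        match = SERIAL_RE.search(line)
        if match:
            devices[match.group(1)]["serial"] = match.group(2)
            continue
        match = SLOT_RE.search(line)
        if match:
            devices[match.group(1)]["slot"] = match.group(2)
            continue
        match = FAIL_RE.search(line)
        if match:
            devices[match.group(1)]["size_query_failed"] = True
    return dict(devices)


# Sample taken from the log excerpt quoted above (truncation kept as posted).
SAMPLE_LOG = """\
Dec 8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
Dec 8 14:12:01 N1 kernel: da23: Command Queueing enabled
Dec 8 14:12:01 N1 kernel: da23: Attempt to query device size failed: NOT READY, Logical unit not ready, cause n
Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
"""

if __name__ == "__main__":
    for dev, info in sorted(summarize(SAMPLE_LOG).items()):
        print(dev, info)
```

Run it against /var/log/messages once before and once after moving the
disk. If the failure flag stays with the same serial number in a new
slot, the drive is the likely culprit; if it stays with the slot while a
different serial number now fails there, suspect the backplane.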