Date: Tue, 8 Dec 2015 21:35:21 +0530
From: prateek sethi <prateekrootkey@gmail.com>
To: Michael Jung <mikej@mikej.com>
Cc: Scott Long <scott4long@yahoo.com>, freebsd-scsi@freebsd.org, owner-freebsd-scsi@freebsd.org
Subject: Re: bad disk discovery
Message-ID: <CABD8d0pH_FTDvDEOA8=-RsuwzgsrABSKCKZTZ=hDM7HBCZ_ppQ@mail.gmail.com>
In-Reply-To: <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>
References: <CABD8d0rTD2Fu9QsqLKREBcA-nndQrUH8F8DWrPh_KQB64qjy1Q@mail.gmail.com> <6A7832F8-53EB-4641-8EF6-E0E6175EB52D@yahoo.com> <CABD8d0oqXoJFEcKzY3cj%2BVv2X%2B5Mg0ceQ%2Bivn3NP%2BHdMdDArRg@mail.gmail.com> <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>

Yes, I have tried that one but the issue is still there. Other disks are
working fine with the same configuration, which means firmware should not be
the problem.

On Tue, Dec 8, 2015 at 6:56 PM, Michael Jung <mikej@mikej.com> wrote:

> On 2015-12-08 08:00, prateek sethi wrote:
>
>> Hi Scott,
>> Thanks for your quick response.
>>
>> I have a different set of hardware. That's why I want to know how I can
>> debug it myself. Is there any way or procedure by which I can find out
>> about the situation or the reason for CDB errors or disk command failures?
>>
>> Right now I am giving details about the setup where I am getting this
>> issue.
>>
>> I am using an LSI SAS2008 controller connected to a Supermicro enclosure
>> with FreeBSD 9.3. There are 16 different disks but only one disk is having
>> the problem. That means the controller and cable are fine.
>>
>> Faulty disk info is as follows.
>>
>> *smartctl output is:-*
>>
>> smartctl -x /dev/da23
>>
>> === START OF INFORMATION SECTION ===
>> Vendor:               SEAGATE
>> Product:              ST3600057SS
>> Revision:             000B
>> Rotation Rate:        15000 rpm
>> Form Factor:          3.5 inches
>> Logical Unit id:      0x5000c5007725173f
>> Serial number:        6SL8YLPC0000N5030DY7
>> Device type:          disk
>> Transport protocol:   SAS
>> Local Time is:        Tue Dec 8 18:20:45 2015 IST
>> *device is NOT READY (e.g. spun down, busy)*
>>
>> *Logs:-*
>>
>> Dec 8 14:12:01 N1 kernel: da23 at mps0 bus 0 scbus0 target 148 lun 0
>> Dec 8 14:12:01 N1 kernel: da23: <SEAGATE ST3600057SS 000B> Fixed Direct Access SCSI-5 device
>> Dec 8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
>> Dec 8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
>> Dec 8 14:12:01 N1 kernel: da23: Command Queueing enabled
>> Dec 8 14:12:01 N1 kernel: da23: *Attempt to query device size failed: NOT READY, Logical unit not ready, cause n*
>> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
>> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: SAS Device Slot Element: 1 Phys at Slot 23
>>
>> *driver versions:-*
>>
>> dev.mps.0.firmware_version: 15.00.00.00
>> dev.mps.0.driver_version: 16.00.00.00-fbsd
>>
>> On Tue, Dec 8, 2015 at 3:15 AM, Scott Long <scott4long@yahoo.com> wrote:
>>
>>> Hi,
>>>
>>> If your situation is accurate and the disk is not responding properly to
>>> regular commands then it’s unlikely that it will respond to SMART
>>> commands either. Sometimes these situations are caused by a bad cable,
>>> bad controller, or buggy software/firmware, and only rarely will the
>>> standard statistics in SMART pick up these kinds of errors. SMART is
>>> better at tracking wear rates and error rates on the physical media,
>>> both HDD and SSD, but even then it’s hard for it to be accurately
>>> predictive or even accurately diagnostic. For your case, I recommend
>>> that you describe your hardware and software configuration in more
>>> detail, and look for physical abnormalities in the cabling and
>>> connections. Once that is ruled out and the rest of us know what kind of
>>> hardware you’re dealing with, we might be able to make better
>>> recommendations.
>>>
>>> Scott
>>>
>>> > On Dec 7, 2015, at 11:07 AM, prateek sethi <prateekrootkey@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Is there any way or tool to find out whether a disk which is not
>>> > responding properly is really bad or not? Sometimes I have seen a lot
>>> > of CDB errors for a drive, and a system reboot makes everything fine.
>>> > What can be the reasons for such scenarios?
>>> >
>>> > I know smartctl is the one which can help. I have a couple of
>>> > questions regarding this.
>>> >
>>> > 1. What if the disk does not support smartctl?
>>> > 2. How can I make the smartest use of the smartctl command, i.e. which
>>> > parameters can tell that the disk is actually bad?
>>> > 3. What other tests can I perform to make sure that the disk has
>>> > completely gone?
>>> >
>>> > Please tell me the correct place to ask this question if I am asking
>>> > in the wrong place.
>>> > _______________________________________________
>>> > freebsd-scsi@freebsd.org mailing list
>>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>
> Have you simply moved the drive to another slot - does the problem follow
> the drive?
> Unlikely but it could be a backplane issue.
>
> I don't know about version 15 firmware, I have always used version 16
> firmware with 9.x to match the driver version.
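
For anyone wanting to check a whole log for the probe failure quoted in this thread, the following is a minimal Python sketch, not something posted by any of the participants. It assumes kernel messages in the same format as the da23 lines above (with the mail client's `*` emphasis markers removed); the names `PROBE_FAILURE`, `find_probe_failures`, and `SAMPLE_LOG` are hypothetical and chosen only for illustration. It only reports which devices logged the failure and with what sense information; it does not decide whether a disk is actually bad.

```python
import re

# Pattern for the da(4) size-query failure as it appears in the quoted log.
# Assumption: lines look like
#   "... kernel: da23: Attempt to query device size failed: NOT READY, <detail>"
PROBE_FAILURE = re.compile(
    r"\b(?P<dev>da\d+): Attempt to query device size failed: "
    r"(?P<sense_key>[A-Z ]+?), (?P<detail>.+)$"
)


def find_probe_failures(lines):
    """Return a list of (device, sense_key, detail) for each failure line."""
    failures = []
    for line in lines:
        match = PROBE_FAILURE.search(line.rstrip("\n"))
        if match:
            failures.append(
                (
                    match.group("dev"),
                    match.group("sense_key"),
                    match.group("detail").strip(),
                )
            )
    return failures


# Two lines taken from the log quoted in this thread (emphasis markers removed).
SAMPLE_LOG = [
    "Dec 8 14:12:01 N1 kernel: da23: Command Queueing enabled",
    "Dec 8 14:12:01 N1 kernel: da23: Attempt to query device size failed: "
    "NOT READY, Logical unit not ready, cause n",
]

if __name__ == "__main__":
    for dev, sense_key, detail in find_probe_failures(SAMPLE_LOG):
        print(f"{dev}: {sense_key} ({detail})")
```

The same function could be fed the lines of a saved system log file instead of `SAMPLE_LOG`; a device that shows up repeatedly, and keeps showing up after being moved to another slot, points at the drive rather than the backplane.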