FreeBSD Mail Archives

Date:      Tue, 8 Dec 2015 21:35:21 +0530
From:      prateek sethi <prateekrootkey@gmail.com>
To:        Michael Jung <mikej@mikej.com>
Cc:        Scott Long <scott4long@yahoo.com>, freebsd-scsi@freebsd.org,  owner-freebsd-scsi@freebsd.org
Subject:   Re: bad disk discovery
Message-ID:  <CABD8d0pH_FTDvDEOA8=-RsuwzgsrABSKCKZTZ=hDM7HBCZ_ppQ@mail.gmail.com>
In-Reply-To: <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>
References:  <CABD8d0rTD2Fu9QsqLKREBcA-nndQrUH8F8DWrPh_KQB64qjy1Q@mail.gmail.com> <6A7832F8-53EB-4641-8EF6-E0E6175EB52D@yahoo.com> <CABD8d0oqXoJFEcKzY3cj%2BVv2X%2B5Mg0ceQ%2Bivn3NP%2BHdMdDArRg@mail.gmail.com> <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>


Yes, I have tried that, but the issue is still there. The other disks are
working fine with the same configuration, which means firmware should not be
the problem.


On Tue, Dec 8, 2015 at 6:56 PM, Michael Jung <mikej@mikej.com> wrote:

> On 2015-12-08 08:00, prateek sethi wrote:
>
>> Hi Scott,
>> Thanks for your quick response.
>>
>> I have a different set of hardware, which is why I want to know how I can
>> debug it myself. Is there any procedure I can use to find out the reason
>> for the CDB errors or disk command failures?
>>
>> Here are the details of the setup where I am seeing this issue.
>>
>> I am using an LSI SAS2008 controller connected to a Supermicro enclosure,
>> running FreeBSD 9.3. There are 16 different disks, but only one disk has
>> the problem. That means the controller and cable are fine.
>>
>> The faulty disk's information is as follows.
>>
>> *smartctl output is:-*
>>
>> smartctl -x /dev/da23
>>
>> === START OF INFORMATION SECTION ===
>> Vendor:               SEAGATE
>> Product:              ST3600057SS
>> Revision:             000B
>> Rotation Rate:        15000 rpm
>> Form Factor:          3.5 inches
>> Logical Unit id:      0x5000c5007725173f
>> Serial number:        6SL8YLPC0000N5030DY7
>> Device type:          disk
>> Transport protocol:   SAS
>> Local Time is:        Tue Dec  8 18:20:45 2015 IST
>> *device is NOT READY (e.g. spun down, busy)*
>>
>> *Logs:-*
>>
>> Dec  8 14:12:01 N1 kernel: da23 at mps0 bus 0 scbus0 target 148 lun 0
>> Dec  8 14:12:01 N1 kernel: da23: <SEAGATE ST3600057SS 000B> Fixed Direct Access SCSI-5 device
>> Dec  8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
>> Dec  8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
>> Dec  8 14:12:01 N1 kernel: da23: Command Queueing enabled
>> Dec  8 14:12:01 N1 kernel: da23: *Attempt to query device size failed: NOT READY, Logical unit not ready, cause n*
>> Dec  8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
>> Dec  8 14:12:01 N1 kernel: ses1: da23,pass26: SAS Device Slot Element: 1 Phys at Slot 23
>>
>> *driver versions:-*
>>
>>
>> dev.mps.0.firmware_version: 15.00.00.00
>> dev.mps.0.driver_version: 16.00.00.00-fbsd
>>
>>
>>
>>
>>
>>
>> On Tue, Dec 8, 2015 at 3:15 AM, Scott Long <scott4long@yahoo.com> wrote:
>>
>>> Hi,
>>>
>>> If your situation is accurate and the disk is not responding properly to
>>> regular commands then it’s unlikely that it will respond to SMART commands
>>> either. Sometimes these situations are caused by a bad cable, bad
>>> controller, or buggy software/firmware, and only rarely will the standard
>>> statistics in SMART pick up these kinds of errors.  SMART is better at
>>> tracking wear rates and error rates on the physical media, both HDD and
>>> SSD, but even then it’s hard for it to be accurately predictive or even
>>> accurately diagnostic.  For your case, I recommend that you describe your
>>> hardware and software configuration in more detail, and look for physical
>>> abnormalities in the cabling and connections. Once that is ruled out and
>>> the rest of us know what kind of hardware you’re dealing with, we might be
>>> able to make better recommendations.
>>>
>>> Scott
>>>
>>> > On Dec 7, 2015, at 11:07 AM, prateek sethi <prateekrootkey@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Is there any way or tool to find out whether a disk that is not
>>> > responding properly is really bad? Sometimes I have seen a lot of CDB
>>> > errors for a drive, and a system reboot makes everything fine. What can
>>> > be the reasons for such scenarios?
>>> >
>>> > I know smartctl is the tool that can help. I have a couple of questions
>>> > regarding it.
>>> >
>>> > 1. What if the disk does not support smartctl?
>>> > 2. How can I make the best use of the smartctl command, i.e. which
>>> > parameters can tell that the disk is actually bad?
>>> > 3. What other tests can I perform to make sure that the disk has
>>> > completely gone?
>>> >
>>> >
>>> > Please tell me the correct place to ask this question if I am asking in
>>> > the wrong place.
>>> > _______________________________________________
>>> > freebsd-scsi@freebsd.org mailing list
>>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>>>
>>>
>>
>
>
> Have you simply moved the drive to another slot - does the problem follow
> the drive?
> Unlikely but it could be a backplane issue.
>
> I don't know about version 15 firmware; I have always used version 16
> firmware with 9.x to match the driver version.
>
>
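For anyone who wants to check the logs themselves, as asked earlier in the
thread: a minimal Python sketch that scans kernel log text for da devices
reporting NOT READY and pairs each one with its serial number, so the same
drive can be recognized after it is moved to another slot. It assumes the log
lines look like the da23 lines quoted above. The names find_not_ready and
SAMPLE are illustrative only and not part of any FreeBSD tool, and the result
only shows which devices logged the error, not whether the disk is actually
bad.

```python
import re

# Assumed line formats, taken from the da23 log lines quoted in this thread.
SERIAL_RE = re.compile(r"kernel: (da\d+): Serial Number (\S+)")
NOT_READY_RE = re.compile(r"kernel: (da\d+): (.*NOT READY.*)")


def find_not_ready(log_text):
    """Return {device: {"serial": str or None, "messages": [str, ...]}}
    for every da device that logged a NOT READY message."""
    serials = {}
    failures = {}
    for line in log_text.splitlines():
        m = SERIAL_RE.search(line)
        if m:
            serials[m.group(1)] = m.group(2)
            continue
        m = NOT_READY_RE.search(line)
        if m:
            failures.setdefault(m.group(1), []).append(m.group(2).strip())
    # Serial numbers are attached last, so line order does not matter.
    return {
        dev: {"serial": serials.get(dev), "messages": msgs}
        for dev, msgs in failures.items()
    }


# Sample input copied from the log excerpt in this thread (wrapped lines joined).
SAMPLE = """\
Dec  8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
Dec  8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
Dec  8 14:12:01 N1 kernel: da23: Attempt to query device size failed: NOT READY, Logical unit not ready, cause n
"""

if __name__ == "__main__":
    for dev, info in find_not_ready(SAMPLE).items():
        print(dev, info["serial"], len(info["messages"]))
```

To run it against a real system, the contents of the system log (for example
/var/log/messages, if that is where kernel messages are written on your
machine) could be passed in place of SAMPLE. If the same serial number shows
the error again after the drive has been moved to a different slot, that
points at the drive rather than the backplane.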

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CABD8d0pH_FTDvDEOA8=-RsuwzgsrABSKCKZTZ=hDM7HBCZ_ppQ>
