Date:      Thu, 10 Dec 2015 11:38:25 -0700
From:      Stephen Mcconnell <stephen.mcconnell@avagotech.com>
To:        prateek sethi <prateekrootkey@gmail.com>
Cc:        Michael Jung <mikej@mikej.com>, freebsd-scsi@freebsd.org,  owner-freebsd-scsi@freebsd.org
Subject:   RE: bad disk discovery
Message-ID:  <ea08e546df7d84d072db475201d55983@mail.gmail.com>
In-Reply-To: <CABD8d0qU0RaMdRx63CaTw09PvBvksmueJjZnq6GhMRAv3zmrhw@mail.gmail.com>
References:  <CABD8d0rTD2Fu9QsqLKREBcA-nndQrUH8F8DWrPh_KQB64qjy1Q@mail.gmail.com> <6A7832F8-53EB-4641-8EF6-E0E6175EB52D@yahoo.com>	<CABD8d0oqXoJFEcKzY3cj%2BVv2X%2B5Mg0ceQ%2Bivn3NP%2BHdMdDArRg@mail.gmail.com> <a786f32633126a6d898a1d211dc1dba0@mail.mikej.com>	<CABD8d0pH_FTDvDEOA8=-RsuwzgsrABSKCKZTZ=hDM7HBCZ_ppQ@mail.gmail.com> <48445286cfa5082c78581b2c1e8afb66@mail.gmail.com> <CABD8d0qU0RaMdRx63CaTw09PvBvksmueJjZnq6GhMRAv3zmrhw@mail.gmail.com>

All I can see from that log is thousands of the same error for an Inquiry
command.  It looks like a bad drive to me.  Did you say that you tried
moving the drive and the problem still happens for that drive in a
different slot?



Steve



*From:* prateek sethi [mailto:prateekrootkey@gmail.com]
*Sent:* Tuesday, December 08, 2015 11:01 PM
*To:* Stephen Mcconnell
*Cc:* Michael Jung; freebsd-scsi@freebsd.org; owner-freebsd-scsi@freebsd.org
*Subject:* Re: bad disk discovery



I have looked at the logs, and they mainly say *Logical unit not ready,
cause not reportable*.

I am attaching the logs related to da22 and da23. (Previously the disk was
da22; after a reboot it became da23.)
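
Because the device name changed across the reboot, it can help to follow the disk by serial number rather than by daN. The following is only a minimal sketch, assuming the kernel log keeps the one-line "daN: Serial Number ..." format shown in the logs in this thread; the function name `devices_by_serial` is hypothetical, not part of any FreeBSD tool.

```python
import re

# Assumed line format, taken from the log excerpt in this thread:
#   "Dec  8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7"
SERIAL_RE = re.compile(r"kernel: (da\d+): Serial Number (\S+)")


def devices_by_serial(lines):
    """Map each serial number to the daN names it appeared under, in order."""
    seen = {}
    for line in lines:
        match = SERIAL_RE.search(line)
        if match is None:
            continue  # not a serial-number announcement
        dev, serial = match.groups()
        names = seen.setdefault(serial, [])
        if dev not in names:
            names.append(dev)
    return seen
```

Passing it the lines of the system log (for example from `/var/log/messages`, assuming the default syslog location) would list every name a given serial number was attached under, so a da22 to da23 rename shows up as one serial with two names.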



On Tue, Dec 8, 2015 at 9:50 PM, Stephen Mcconnell <stephen.mcconnell@avagotech.com> wrote:

Try looking through the system log to see if there is any debug output from
the mps driver, or you can send it to me and I'll take a look.  It might
give us a clue as to what's going on.
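
As a rough illustration of that kind of log search, here is a minimal sketch that counts kernel log lines per mps controller or da disk. It assumes lines start their message with the device name after "kernel: ", as in the excerpts quoted in this thread, optionally inside a parenthesised CAM path; the helper name `count_device_lines` is hypothetical.

```python
import re
from collections import Counter

# Matches messages whose subject is an mps controller or a da disk, e.g.
#   "... kernel: da23: Command Queueing enabled"
#   "... kernel: mps0: ..."
# The optional "(" allows for messages that begin with a parenthesised path.
DEVICE_RE = re.compile(r"kernel: \(?((?:mps|da)\d+)\b")


def count_device_lines(lines):
    """Count log lines per device so a noisy disk stands out."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Run over the system log, a single device with a count far above the others is the one to look at first. Lines reported by other drivers (such as ses1) are deliberately ignored here.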

Steve McConnell


> -----Original Message-----
> From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-scsi@freebsd.org]
> On Behalf Of prateek sethi
> Sent: Tuesday, December 08, 2015 9:05 AM
> To: Michael Jung
> Cc: freebsd-scsi@freebsd.org; owner-freebsd-scsi@freebsd.org
> Subject: Re: bad disk discovery
>
> Yes, I have tried that, but the issue is still there. The other disks are
> working fine with the same configuration, which means firmware should not
> be the problem.
>
>
> On Tue, Dec 8, 2015 at 6:56 PM, Michael Jung <mikej@mikej.com> wrote:
>
> > On 2015-12-08 08:00, prateek sethi wrote:
> >
> >> Hi Scott,
> >> Thanks for your quick response.
> >>
> >> I have a different set of hardware, which is why I want to know how I
> >> can debug it myself. Is there any way or procedure I can use to find
> >> out the reason for the CDB errors or disk command failures?
> >>
> >> Here are the details of the setup where I am seeing this issue.
> >>
> >> I am using an LSI SAS2008 controller connected to a Supermicro
> >> enclosure, running FreeBSD 9.3. There are 16 disks, but only one
> >> disk has the problem. That means the controller and cable are fine.
> >>
> >> The faulty disk's details are:
> >>
> >> *smartctl output is:-*
> >>
> >> smartctl -x /dev/da23
> >>
> >> === START OF INFORMATION SECTION ===
> >> Vendor:               SEAGATE
> >> Product:              ST3600057SS
> >> Revision:             000B
> >> Rotation Rate:        15000 rpm
> >> Form Factor:          3.5 inches
> >> Logical Unit id:      0x5000c5007725173f
> >> Serial number:        6SL8YLPC0000N5030DY7
> >> Device type:          disk
> >> Transport protocol:   SAS
> >> Local Time is:        Tue Dec  8 18:20:45 2015 IST
> >> *device is NOT READY (e.g. spun down, busy)*
> >>
> >> *Logs:-*
> >>
> >> Dec  8 14:12:01 N1 kernel: da23 at mps0 bus 0 scbus0 target 148 lun 0
> >> Dec  8 14:12:01 N1 kernel: da23: <SEAGATE ST3600057SS 000B> Fixed Direct Access SCSI-5 device
> >> Dec  8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
> >> Dec  8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
> >> Dec  8 14:12:01 N1 kernel: da23: Command Queueing enabled
> >> Dec  8 14:12:01 N1 kernel: da23: *Attempt to query device size failed: NOT READY, Logical unit not ready, cause n*
> >> Dec  8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
> >> Dec  8 14:12:01 N1 kernel: ses1: da23,pass26: SAS Device Slot Element: 1 Phys at Slot 23
> >>
> >> *driver versions:-*
> >>
> >>
> >> dev.mps.0.firmware_version: 15.00.00.00
> >> dev.mps.0.driver_version: 16.00.00.00-fbsd
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Dec 8, 2015 at 3:15 AM, Scott Long <scott4long@yahoo.com> wrote:
> >>
> >> Hi,
> >>>
> >>> If your situation is accurate and the disk is not responding
> >>> properly to regular commands then it’s unlikely that it will respond
> >>> to SMART commands either.
> >>> Sometimes these situations are caused by a bad cable, bad
> >>> controller, or buggy software/firmware, and only rarely will the
> >>> standard statistics in SMART pick up these kinds of errors.  SMART
> >>> is better at tracking wear rates and error rates on the physical
> >>> media, both HDD and SSD, but even then it’s hard for it to be
> >>> accurately predictive or even accurately diagnostic.  For your case,
> >>> I recommend that you describe your hardware and software
> >>> configuration in more detail, and look for physical abnormalities in
> >>> the cabling and connections.
> >>> Once that is ruled out and the rest of us know what kind of hardware
> >>> you’re dealing with, we might be able to make better recommendations.
> >>>
> >>> Scott
> >>>
> >>> > On Dec 7, 2015, at 11:07 AM, prateek sethi <prateekrootkey@gmail.com> wrote:
> >>> >
> >>> > Hi ,
> >>> >
> >>> > Is there any way or tool to find out whether a disk that is not
> >>> > responding properly is really bad? Sometimes I have seen a lot of
> >>> > CDB errors for a drive, and a system reboot makes everything fine.
> >>> > What can be the reasons for such scenarios?
> >>> >
> >>> > I know smartctl is the tool that can help. I have a couple of
> >>> > questions about it.
> >>> >
> >>> > 1. What if the disk does not support smartctl?
> >>> > 2. How can I make the best use of the smartctl command, i.e. which
> >>> >    parameters can tell that the disk is actually bad?
> >>> > 3. What other tests can I perform to make sure that the disk has
> >>> >    completely failed?
> >>> >
> >>> >
> >>> > Please tell me the correct place to ask this question if I am
> >>> > asking in the wrong place.
> >>> > _______________________________________________
> >>> > freebsd-scsi@freebsd.org mailing list
> >>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> >>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
> >>>
> >>>
> >>
> >
> >
> > Have you simply moved the drive to another slot - does the problem
> > follow the drive?
> > Unlikely but it could be a backplane issue.
> >
> > I don't know about version 15 firmware, I have always used version 16
> > firmware with 9.x to match the driver version.
> >
> >



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ea08e546df7d84d072db475201d55983>