From owner-freebsd-scsi@freebsd.org Tue Dec 8 13:26:52 2015
Date: Tue, 08 Dec 2015 08:26:24 -0500
From: Michael Jung
To: prateek sethi
Cc: Scott Long, freebsd-scsi@freebsd.org, owner-freebsd-scsi@freebsd.org
Subject: Re: bad disk discovery
References: <6A7832F8-53EB-4641-8EF6-E0E6175EB52D@yahoo.com>

On 2015-12-08 08:00, prateek sethi wrote:
> Hi Scott,
> Thanks for your quick response.
>
> I have a different set of hardware. That's why I want to know how I can
> debug it myself. Is there any way or procedure I can use to find out
> about the situation, or the reason for CDB errors or disk command failure?
>
> Right now I am giving details about the setup where I am getting this
> issue.
>
> I am using an LSI SAS2008 controller connected to a Supermicro enclosure
> with FreeBSD 9.3. There are 16 different disks but only one disk is having
> a problem. That means the controller and cable are fine.
>
> Faulty disk info is as follows:
>
> *smartctl output is:-*
>
> smartctl -x /dev/da23
>
> === START OF INFORMATION SECTION ===
> Vendor: SEAGATE
> Product: ST3600057SS
> Revision: 000B
> Rotation Rate: 15000 rpm
> Form Factor: 3.5 inches
> Logical Unit id: 0x5000c5007725173f
> Serial number: 6SL8YLPC0000N5030DY7
> Device type: disk
> Transport protocol: SAS
> Local Time is: Tue Dec 8 18:20:45 2015 IST
> *device is NOT READY (e.g. spun down, busy)*
>
> *Logs:-*
>
> Dec 8 14:12:01 N1 kernel: da23 at mps0 bus 0 scbus0 target 148 lun 0
> Dec 8 14:12:01 N1 kernel: da23: Fixed Direct Access SCSI-5 device
> Dec 8 14:12:01 N1 kernel: da23: Serial Number 6SL8YLPC0000N5030DY7
> Dec 8 14:12:01 N1 kernel: da23: 600.000MB/s transfers
> Dec 8 14:12:01 N1 kernel: da23: Command Queueing enabled
> Dec 8 14:12:01 N1 kernel: da23: *Attempt to query device size failed: NOT READY, Logical unit not ready, cause n*
> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: Element descriptor: 'Slot 24'
> Dec 8 14:12:01 N1 kernel: ses1: da23,pass26: SAS Device Slot Element: 1 Phys at Slot 23
>
> *driver versions:-*
>
> dev.mps.0.firmware_version: 15.00.00.00
> dev.mps.0.driver_version: 16.00.00.00-fbsd
>
> On Tue, Dec 8, 2015 at 3:15 AM, Scott Long wrote:
>
>> Hi,
>>
>> If your situation is accurate and the disk is not responding properly to
>> regular commands then it’s unlikely that it will respond to SMART commands
>> either. Sometimes these situations are caused by a bad cable, bad
>> controller, or buggy software/firmware, and only rarely will the standard
>> statistics in SMART pick up these kinds of errors. SMART is better at
>> tracking wear rates and error rates on the physical media, both HDD and
>> SSD, but even then it’s hard for it to be accurately predictive or even
>> accurately diagnostic. For your case, I recommend that you describe your
>> hardware and software configuration in more detail, and look for physical
>> abnormalities in the cabling and connections. Once that is ruled out and
>> the rest of us know what kind of hardware you’re dealing with, we might be
>> able to make better recommendations.
>>
>> Scott
>>
>> > On Dec 7, 2015, at 11:07 AM, prateek sethi wrote:
>> >
>> > Hi,
>> >
>> > Is there any way or tool to find out whether a disk which is not
>> > responding properly is really bad or not?
>> > Sometimes I have seen a lot of CDB errors for a drive, and a system
>> > reboot makes everything fine. What can be the reasons for such
>> > scenarios?
>> >
>> > I know smartctl is the one which can help. I have a couple of questions
>> > regarding this.
>> >
>> > 1. What if the disk does not support smartctl?
>> > 2. How can I make the smartest use of the smartctl command, i.e. which
>> >    parameters can tell that the disk is actually bad?
>> > 3. What other tests can I perform to make sure that the disk has
>> >    completely gone?
>> >
>> > Please tell me the correct place to ask this question if I am asking in
>> > the wrong place.
>> > _______________________________________________
>> > freebsd-scsi@freebsd.org mailing list
>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

Have you tried simply moving the drive to another slot? Does the problem
follow the drive? It is unlikely, but it could be a backplane issue.

I don't know about version 15 firmware; I have always used version 16
firmware with 9.x to match the driver version.
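
The firmware/driver comparison mentioned above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: it reads
text in the "name: value" form that the dev.mps.0.firmware_version and
dev.mps.0.driver_version lines quoted in this thread use, and it compares only
the leading (major) number. The helper names (major_version,
parse_mps_versions, mismatched_units) are hypothetical, and treating a
major-version difference as worth a second look reflects the practice
described in this reply, not a documented requirement of the mps driver.

```python
import re

# Sample input copied from the values quoted earlier in this thread.
SAMPLE = """\
dev.mps.0.firmware_version: 15.00.00.00
dev.mps.0.driver_version: 16.00.00.00-fbsd
"""

# Matches only the two sysctl names that appear in the thread.
_LINE = re.compile(r"^dev\.mps\.(\d+)\.(firmware_version|driver_version):\s*(\S+)")


def major_version(value):
    """Return the leading number of a version string like '16.00.00.00-fbsd'."""
    match = re.match(r"\s*(\d+)", value)
    if match is None:
        raise ValueError(f"no leading version number in {value!r}")
    return int(match.group(1))


def parse_mps_versions(sysctl_text):
    """Collect firmware/driver version strings per mps unit number."""
    versions = {}
    for line in sysctl_text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            unit, key, value = match.groups()
            versions.setdefault(int(unit), {})[key] = value
    return versions


def mismatched_units(versions):
    """Return (unit, firmware_major, driver_major) where the majors differ."""
    mismatches = []
    for unit in sorted(versions):
        entry = versions[unit]
        # Skip units for which only one of the two values was found.
        if "firmware_version" in entry and "driver_version" in entry:
            fw = major_version(entry["firmware_version"])
            drv = major_version(entry["driver_version"])
            if fw != drv:
                mismatches.append((unit, fw, drv))
    return mismatches


if __name__ == "__main__":
    for unit, fw, drv in mismatched_units(parse_mps_versions(SAMPLE)):
        print(f"mps{unit}: firmware major {fw}, driver major {drv}")
```

On a live system the same text could be captured from sysctl output and passed
to parse_mps_versions. A reported mismatch is only a hint about where to look
next; moving the drive to another slot remains the more direct test.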