From owner-freebsd-scsi@FreeBSD.ORG Sun Jun 10 00:17:24 2012 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 050B8106566B for ; Sun, 10 Jun 2012 00:17:24 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.tru-signal.biz [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id BBA068FC12 for ; Sun, 10 Jun 2012 00:17:23 +0000 (UTC) Received: from service02.office.ebureau.com (service02.office.ebureau.com [192.168.20.15]) by internet02.ebureau.com (Postfix) with ESMTP id 842C7CA5143; Sat, 9 Jun 2012 19:17:17 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id 6990A9E358CD; Sat, 9 Jun 2012 19:17:17 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (service02.office.iscompanies.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zDWitbHjPQ0b; Sat, 9 Jun 2012 19:17:17 -0500 (CDT) Received: from [10.139.109.122] (unknown [192.168.26.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id 01FFF9E358C6; Sat, 9 Jun 2012 19:17:17 -0500 (CDT) References: In-Reply-To: Mime-Version: 1.0 (1.0) Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Message-Id: <404902E8-1144-4B39-9C03-FAC38FAA853C@ebureau.com> X-Mailer: iPhone Mail (9B206) From: Dustin Wenz Date: Sat, 9 Jun 2012 19:17:11 -0500 To: Kyle Creyts Cc: "freebsd-scsi@freebsd.org" Subject: Re: Marginal disks prevent boot with mps(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Jun 2012 00:17:24 -0000

That workaround is effective, but hard to execute when the system is on the other side of town.
It is also difficult to identify the affected disk when there are several dozen connected in a JBOD chassis. As Ken suggested, I'm going to investigate possible HBA and expander firmware issues on Monday.

- .Dustin

On Jun 8, 2012, at 11:38 PM, Kyle Creyts wrote:

> Pop the offending disk out, then back in after boot. Consider replacing.
>
> Dustin Wenz wrote:
>
> I just installed a build of 9.0-STABLE in order to test the changes since release. I was hoping that some of the error-handling in mps would alter the behavior I've seen with some SATA disks (particularly, Seagate ST3000DM001 disks) connected through an LSI SAS 9201-16e HBA.
>
> It is apparently possible for these disks to get in a state where their presence prevents the machine from booting. This problem has existed for some time, according to some archive-searching I've done, but there isn't much consensus on how to fix it.
>
> The disks are good enough that they can be probed at startup, but some part of initialization cannot complete. This is the message I see repeated forever upon boot (the probe number does change slightly):
>
> (probe14:mps0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0 length 36 SMID 215 terminated ioc 804b scsi 0 state c xfer 0
>
> There is a comment in mps_sas.c which suggests that this error is usually transient, but that seems not to be the case here. Can anyone suggest a modification that might permit booting in this state?
>=20 > - .Dustin >=20 > _______________________________________________ > freebsd-scsi@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" From owner-freebsd-scsi@FreeBSD.ORG Mon Jun 11 11:07:33 2012 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 08D901065673 for ; Mon, 11 Jun 2012 11:07:33 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E71448FC16 for ; Mon, 11 Jun 2012 11:07:32 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q5BB7Wgm053426 for ; Mon, 11 Jun 2012 11:07:32 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q5BB7Woo053424 for freebsd-scsi@FreeBSD.org; Mon, 11 Jun 2012 11:07:32 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 11 Jun 2012 11:07:32 GMT Message-Id: <201206111107.q5BB7Woo053424@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Jun 2012 11:07:33 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. 
These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/165982  scsi       [mpt] mpt instability, drive resets, and losses on Fre
o kern/165740  scsi       [cam] SCSI code must drain callbacks before free
o kern/163713  scsi       [aic7xxx] [patch] Add Adaptec29329LPE to aic79xx_pci.c
o kern/162256  scsi       [mpt] QUEUE FULL EVENT and 'mpt_cam_event: 0x0'
o kern/161809  scsi       [cam] [patch] set kern.cam.boot_delay via build option
o kern/159412  scsi       [ciss] 7.3 RELEASE: ciss0 ADAPTER HEARTBEAT FAILED err
o kern/157770  scsi       [iscsi] [panic] iscsi_initiator panic
o kern/154432  scsi       [xpt] run_interrupt_driven_hooks: still waiting after
o kern/153514  scsi       [cam] [panic] CAM related panic
o kern/153361  scsi       [ciss] Smart Array 5300 boot/detect drive problem
o kern/152250  scsi       [ciss] [patch] Kernel panic when hw.ciss.expose_hidden
o kern/151564  scsi       [ciss] ciss(4) should increase CISS_MAX_LOGICAL to 10
o docs/151336  scsi       Missing documentation of scsi_ and ata_ functions in c
s kern/149927  scsi       [cam] hard drive not stopped before removing power dur
o kern/148083  scsi       [aac] Strange device reporting
o kern/147704  scsi       [mpt] sys/dev/mpt: new chip revision, partially unsupp
o kern/146287  scsi       [ciss] ciss(4) cannot see more than one SmartArray con
o kern/145768  scsi       [mpt] can't perform I/O on SAS based SAN disk in freeb
o kern/144648  scsi       [aac] Strange values of speed and bus width in dmesg
o kern/144301  scsi       [ciss] [hang] HP proliant server locks when using ciss
o kern/142351  scsi       [mpt] LSILogic driver performance problems
o kern/134488  scsi       [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132250  scsi       [ciss] ciss driver does not support more then 15 drive
o kern/132206  scsi       [mpt] system panics on boot when mirroring and 2nd dri
o kern/130621  scsi       [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602  scsi       [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452  scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi       [isp] isp(4) target driver crashes kernel when set up
o kern/127717  scsi       [ata] [patch] [request] - support write cache toggling
o kern/123674  scsi       [ahc] ahc driver dumping
o kern/123520  scsi       [ahd] unable to boot from net while using ahd
o sparc/121676 scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi       [sg] scsi_sg incompatible with scanners
o kern/120247  scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/114597  scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/92798   scsi       [ahc] SCSI problem with timeouts
o kern/90282   scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi       wire down of scsi devices conflicts with config
s kern/57398   scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi       [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/39388   scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/35234   scsi       World access to /dev/pass? (for scanner) requires acce

50 problems total.
From owner-freebsd-scsi@FreeBSD.ORG Tue Jun 12 17:25:22 2012 Return-Path: Delivered-To: FreeBSD-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8B3F01065672 for ; Tue, 12 Jun 2012 17:25:22 +0000 (UTC) (envelope-from seanbru@yahoo-inc.com) Received: from mrout1-b.corp.bf1.yahoo.com (mrout1-b.corp.bf1.yahoo.com [98.139.253.104]) by mx1.freebsd.org (Postfix) with ESMTP id 43B728FC0A for ; Tue, 12 Jun 2012 17:25:19 +0000 (UTC) Received: from [IPv6:::1] (rideseveral.corp.yahoo.com [10.73.160.231]) by mrout1-b.corp.bf1.yahoo.com (8.14.4/8.14.4/y.out) with ESMTP id q5CHEvkE041212 for ; Tue, 12 Jun 2012 10:14:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=yahoo-inc.com; s=cobra; t=1339521298; bh=flT9TxNBnKdkr954ZMrbCEOg6x16vm+JRAIQTguAtmw=; h=Subject:From:Reply-To:To:Content-Type:Date:Message-ID: Mime-Version:Content-Transfer-Encoding; b=vWNuwcmwVa+T4H2xIdw7yT5vG8AKCEn/0YsGY3DEv7FwB5uhU7vFnGg8UVRoEYfEu IbFcvAgp43BR8ebEQkzxpx14Hd49eyi6r4bDiskwd17UNQv4n9o/a5BHML+BgfSlYG iQzEPAap9GpIW0edwaTL0nj/qA7HtdvHnO8gFAUk= From: Sean Bruno To: "FreeBSD-scsi@freebsd.org" Content-Type: text/plain; charset="UTF-8" Date: Tue, 12 Jun 2012 10:14:57 -0700 Message-ID: <1339521297.2845.4.camel@powernoodle.corp.yahoo.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-Milter-Version: master.31+4-gbc07cd5+ X-CLX-ID: 521297001 Cc: Subject: wonky Dell H200 controller X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: sbruno@freebsd.org List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jun 2012 17:25:22 -0000 on stable/9 ish I see what appears to be a bad thing, but results in no problems on the server. failure at ../../../dev/mps/mps_sas_lsi.c:647/mpssas_add_device()! 
Could not get ID for device with handle 0x0009 mpssas_fw_work: failed to add device with handle 0x9 mpssas_prepare_remove 506 : invalid handle 0x9 I've had issues with this controller's configuration (trying to mix SATA/SAS disks). Sean From owner-freebsd-scsi@FreeBSD.ORG Tue Jun 12 17:46:29 2012 Return-Path: Delivered-To: FreeBSD-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 355761065670; Tue, 12 Jun 2012 17:46:29 +0000 (UTC) (envelope-from attila.bogar@linguamatics.com) Received: from mail.linguamatics.com (mail.linguamatics.com [188.39.80.203]) by mx1.freebsd.org (Postfix) with ESMTP id E14AD8FC0A; Tue, 12 Jun 2012 17:46:28 +0000 (UTC) Received: from [10.252.10.246] (archangel-wl.linguamatics.com [10.252.10.246]) by mail.linguamatics.com (Postfix) with ESMTPSA id DADE3EFB44F; Tue, 12 Jun 2012 18:37:17 +0100 (BST) Message-ID: <4FD77E4C.8010504@linguamatics.com> Date: Tue, 12 Jun 2012 18:37:16 +0100 From: =?UTF-8?B?QXR0aWxhIEJvZ8Ohcg==?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1 MIME-Version: 1.0 To: FreeBSD-scsi@freebsd.org References: <1339521297.2845.4.camel@powernoodle.corp.yahoo.com> In-Reply-To: <1339521297.2845.4.camel@powernoodle.corp.yahoo.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: seanbru@yahoo-inc.com Subject: Re: wonky Dell H200 controller X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jun 2012 17:46:29 -0000 Hi Sean, On 12/06/12 18:14, Sean Bruno wrote: > on stable/9 ish I see what appears to be a bad thing, but results in no > problems on the server. We had some problems with the H200 as well. 
FYI: It's possible to flash the firmware into a stock LSI IT one using: http://forums.servethehome.com/showthread.php?467-DELL-H200-Flash-to-IT-firmware-Procedure-for-DELL-servers&highlight=h200 After flashed, the H200 won't work in the integrated slot, so you'll need longer mini-SAS cables. This is just a workaround, anyway. Attila From owner-freebsd-scsi@FreeBSD.ORG Tue Jun 12 18:29:47 2012 Return-Path: Delivered-To: FreeBSD-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D5F1106566C; Tue, 12 Jun 2012 18:29:47 +0000 (UTC) (envelope-from seanbru@yahoo-inc.com) Received: from mrout1-b.corp.bf1.yahoo.com (mrout1-b.corp.bf1.yahoo.com [98.139.253.104]) by mx1.freebsd.org (Postfix) with ESMTP id 45E368FC16; Tue, 12 Jun 2012 18:29:47 +0000 (UTC) Received: from [IPv6:::1] (rideseveral.corp.yahoo.com [10.73.160.231]) by mrout1-b.corp.bf1.yahoo.com (8.14.4/8.14.4/y.out) with ESMTP id q5CITRN8080363; Tue, 12 Jun 2012 11:29:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=yahoo-inc.com; s=cobra; t=1339525768; bh=cj4+l2ZxvSBfbS3nSZ7BsVV75uS//EaSbovn5L15yeU=; h=Subject:From:To:Cc:In-Reply-To:References:Content-Type:Date: Message-ID:Mime-Version:Content-Transfer-Encoding; b=ZsTfobWEMIgkm5F81d3WDUfu71YZpRgKUdXG6GUbprGkKeGJRIxzZLn5FJMvCPLS0 AyN1jJw3VTVB4eYsFIa7/EiU91cbs5ThwVly9dGfgU6XrJ5Fthg34tIo+cIs1Y4kPe wALGWpnm0I5DFo7xqn5H8AK84X6ffAjPR9tsXr/M= From: Sean Bruno To: "sbruno@freebsd.org" In-Reply-To: <1339521297.2845.4.camel@powernoodle.corp.yahoo.com> References: <1339521297.2845.4.camel@powernoodle.corp.yahoo.com> Content-Type: text/plain; charset="UTF-8" Date: Tue, 12 Jun 2012 11:29:27 -0700 Message-ID: <1339525767.2845.5.camel@powernoodle.corp.yahoo.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-Milter-Version: master.31+4-gbc07cd5+ X-CLX-ID: 525767020 Cc: "FreeBSD-scsi@freebsd.org" Subject: Re: wonky Dell 
H200 controller (mps) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jun 2012 18:29:47 -0000 On Tue, 2012-06-12 at 10:14 -0700, Sean Bruno wrote: > on stable/9 ish I see what appears to be a bad thing, but results in no > problems on the server. > > failure at ../../../dev/mps/mps_sas_lsi.c:647/mpssas_add_device()! Could > not get ID for device with handle 0x0009 > mpssas_fw_work: failed to add device with handle 0x9 > mpssas_prepare_remove 506 : invalid handle 0x9 > > > I've had issues with this controller's configuration (trying to mix > SATA/SAS disks). > > Sean I guess I should actually specify which driver is in use here in case its not clear. sean From owner-freebsd-scsi@FreeBSD.ORG Fri Jun 15 17:21:23 2012 Return-Path: Delivered-To: scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3EEB31065748 for ; Fri, 15 Jun 2012 17:21:23 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 196C98FC12 for ; Fri, 15 Jun 2012 17:21:23 +0000 (UTC) Received: from [192.168.135.100] (c-76-126-166-136.hsd1.ca.comcast.net [76.126.166.136]) (authenticated bits=0) by ns1.feral.com (8.14.4/8.14.4) with ESMTP id q5FHLGBO017386 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Fri, 15 Jun 2012 10:21:16 -0700 (PDT) (envelope-from mj@feral.com) Message-ID: <4FDB6F06.6080108@feral.com> Date: Fri, 15 Jun 2012 10:21:10 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 MIME-Version: 1.0 To: scsi@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed 
by milter-greylist-4.2.7 (ns1.feral.com [192.67.166.1]); Fri, 15 Jun 2012 10:21:17 -0700 (PDT) Cc: Subject: headsup on enclosure driver X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Matt Jacob List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2012 17:21:23 -0000 Doing two things: 1) Removing SEN support. I doubt any of the hardware survived past 1996 and I'll put the code back and eat my freebsd membership card if I'm wrong. It wasn't supported anyway. 2) Default logging via enc_log to not be chatty unless a new sysctl enc_verbose or bootverbose is set. This is all motivated by my hudson case system's log that is filled full of charm like: ses0: Element 6 Beyond End of Additional Element Status Descriptors Will hold off checking in for a day in case anyone cares. From owner-freebsd-scsi@FreeBSD.ORG Fri Jun 15 17:45:49 2012 Return-Path: Delivered-To: scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EBC36106566B; Fri, 15 Jun 2012 17:45:49 +0000 (UTC) (envelope-from gibbs@scsiguy.com) Received: from aslan.scsiguy.com (www.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id B2D9B8FC08; Fri, 15 Jun 2012 17:45:49 +0000 (UTC) Received: from [192.168.6.149] (207-225-98-3.dia.static.qwest.net [207.225.98.3]) (authenticated bits=0) by aslan.scsiguy.com (8.14.5/8.14.5) with ESMTP id q5FHjmg8045879 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Fri, 15 Jun 2012 11:45:49 -0600 (MDT) (envelope-from gibbs@scsiguy.com) Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset=iso-8859-1 From: "Justin T. 
Gibbs" In-Reply-To: <4FDB6F06.6080108@feral.com> Date: Fri, 15 Jun 2012 11:45:43 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4FDB6F06.6080108@feral.com> To: Matt Jacob X-Mailer: Apple Mail (2.1278) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (aslan.scsiguy.com [70.89.174.89]); Fri, 15 Jun 2012 11:45:49 -0600 (MDT) Cc: scsi@freebsd.org Subject: Re: headsup on enclosure driver X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2012 17:45:50 -0000

On Jun 15, 2012, at 11:21 AM, Matthew Jacob wrote:

> Doing two things:
>
> 1) Removing SEN support. I doubt any of the hardware survived past 1996 and I'll put the code back and eat my freebsd membership card if I'm wrong. It wasn't supported anyway.
>
> 2) Default logging via enc_log to not be chatty unless a new sysctl enc_verbose or bootverbose is set.
> This is all motivated by my hudson case system's log that is filled full of charm like:
>
> ses0: Element 6 Beyond End of Additional Element Status Descriptors

I'm not sure that this error is benign.
--
Justin

From owner-freebsd-scsi@FreeBSD.ORG Fri Jun 15 23:06:54 2012 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A7A7B106566B for ; Fri, 15 Jun 2012 23:06:54 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.ebureau.com [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id 68FEC8FC12 for ; Fri, 15 Jun 2012 23:06:54 +0000 (UTC) Received: from service02.office.ebureau.com (service02.office.ebureau.com [192.168.20.15]) by internet02.ebureau.com (Postfix) with ESMTP id EF449CB4B61 for ; Fri, 15 Jun 2012 18:06:47 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id D53109F0C1D3 for ; Fri, 15 Jun 2012 18:06:47 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (service02.office.iscompanies.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZntZD03PMn4a for ; Fri, 15 Jun 2012 18:06:47 -0500 (CDT) Received: from square.office.iscompanies.com (square.office.iscompanies.com [10.10.20.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id 744B19F0C1C6 for ; Fri, 15 Jun 2012 18:06:47 -0500 (CDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Apple Message framework v1257) From: Dustin Wenz In-Reply-To: <20120608215326.GA83721@nargothrond.kdm.org> Date: Fri, 15 Jun 2012 18:06:47 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <551EFA9B-74F7-4CFC-954C-C9E0440E2BDC@ebureau.com> References: <60F17E0E-EE4A-4F37-9925-055315B987B1@ebureau.com> <20120608215326.GA83721@nargothrond.kdm.org> To: freebsd-scsi@freebsd.org X-Mailer: Apple Mail (2.1257) Subject: Re: Marginal disks prevent boot with mps(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: ,
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2012 23:06:54 -0000

I just received a SFF-8088->8087 cable via FedEx this morning, which allowed me to continue to isolate this problem.

What I discovered is that it makes no difference whether a bad disk is connected to an expander, or if one is connected directly to the HBA. So, if this is a hardware bug, it must be present in the LSI SAS2008-based HBA that I'm using. The firmware on the card was also upgraded from v11.00.00.00 to v13.00.57.00, which is the latest as far as I am aware. That did not seem to change the behavior.

I did notice that earlier during startup, I see this message a page or so before the endless ioc messages start:
mps0: polling failed
mpssas_get_sata_identify: poll for page completed with error 60_mapping_get_dev
info: failed to compute the hashed SAS address for SATA device with handle 0x0009

It seems that the driver knows something is up; even before it gets stuck later on...

So far, the only way I can get this configuration to boot is to change the status for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED to CAM_REQ_CMP_ERR, as Ken mentioned. That change will still cause the machine to report some "ioc terminated" messages, but will not hang the startup process indefinitely. However, I'm not sure what the implications of making that change on a production machine would be.

If this is LSI's problem, I don't see why they would bother to fix it. As far as I know, they are the only 6Gb SAS/SATA HBA vendor that works on FreeBSD. We have no choice but to buy their stuff, even if it's not robust.

- .Dustin

On Jun 8, 2012, at 4:53 PM, Kenneth D. Merry wrote:

> On Fri, Jun 08, 2012 at 16:25:31 -0500, Dustin Wenz wrote:
>> I just installed a build of 9.0-STABLE in order to test the changes since release. I was hoping that some of the error-handling in mps would alter the behavior I've seen with some SATA disks (particularly, Seagate ST3000DM001 disks) connected through an LSI SAS 9201-16e HBA.
>>
>
> Are you using an expander, or are the disks connected directly to the HBA?
>
> What firmware version are you using on the HBA? Make sure you have the
> latest firmware version on the card.
>
>> It is apparently possible for these disks to get in a state where their presence prevents the machine from booting. This problem has existed for some time, according to some archive-searching I've done, but there isn't much consensus on how to fix it.
>>
>> The disks are good enough that they can be probed at startup, but some part of initialization cannot complete. This is the message I see repeated forever upon boot (the probe number does change slightly):
>>
>> (probe14:mps0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0 length 36 SMID 215 terminated ioc 804b scsi 0 state c xfer 0
>>
>> There is a comment in mps_sas.c which suggests that this error is usually transient, but that seems not to be the case here. Can anyone suggest a modification that might permit booting in this state?
>>
>
> There is not a lot that the driver can do in this case. The command is
> getting terminated by the firmware in the HBA, and we really don't have a
> lot of information to indicate why.
>
> You could change the status returned for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED
> to CAM_REQ_CMP_ERR, and that would just mean that the probe for that disk
> would eventually fail and the kernel would boot. CAM_REQUEUE_REQ tells
> CAM to retry the command without decrementing the retry count. That is
> why you aren't able to boot.
>
> If upgrading the HBA firmware doesn't fix the problem, I would suggest
> contacting LSI support, and see if they can get additional diagnostics off
> the board to figure out what the problem is.
>=20 > Ken > --=20 > Kenneth Merry > ken@FreeBSD.ORG From owner-freebsd-scsi@FreeBSD.ORG Fri Jun 15 23:46:32 2012 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5B218106566B for ; Fri, 15 Jun 2012 23:46:32 +0000 (UTC) (envelope-from kcreyts@merit.edu) Received: from sfpop-ironport01.merit.edu (sfpop-ironport01.merit.edu [207.75.116.67]) by mx1.freebsd.org (Postfix) with ESMTP id 23A278FC12 for ; Fri, 15 Jun 2012 23:46:31 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.75,781,1330923600"; d="scan'208";a="149234953" Received: from merit-mailstore01.merit.edu ([10.108.1.190]) by sfpop-ironport01-ob.merit.edu with ESMTP; 15 Jun 2012 19:45:23 -0400 Date: Fri, 15 Jun 2012 19:45:23 -0400 (EDT) Message-ID: From: Kyle Creyts To: Dustin Wenz , freebsd-scsi@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=UTF-8 X-Mailer: Zimbra 7.2.0_GA_2669 (MobileSync - Android/0.3) Cc: Subject: Re: Marginal disks prevent boot with mps(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2012 23:46:32 -0000 Iirc, this is a camctl problem. Dustin Wenz wrote: I just received a SFF-8088->8087 cable via FedEx this morning, which allowed me to continue to isolate this problem. What I discovered is that it makes no difference whether a bad disk is connected to an expander, or if one is connected directly to the HBA. So, if this is a hardware bug, it must be present in the LSI SAS2008-based HBA that I'm using. The firmware on the card was also upgraded from v11.00.00.00 to v13.00.57.00, which is the latest as far as I am aware. That did not seem to change the behavior. 
I did notice that earlier during startup, I see this message a page or so before the endless ioc messages start: mps0: polling failed mpssas_get_sata_identify: poll for page completed with error 60_mapping_get_dev info: failed to compute the hashed SAS address for SATA device with handle 0x0009 It seems that the driver knows something is up; even before it gets stuck later on... So far, the only way I can get this configuration to boot is to change the status for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED to CAM_REQ_CMP_ERR, as Ken mentioned. That change will still cause the machine to report some "ioc terminated" messages, but will not hang the startup process indefinitely. However, I'm not sure what the implications of making that change on a production machine would be. If this is LSI's problem, I don't see why they would bother to fix it. As far as I know, they are the only 6Gb SAS/SATA HBA vendor that works on FreeBSD. We have no choice but to buy their stuff, even if it's not robust. - .Dustin On Jun 8, 2012, at 4:53 PM, Kenneth D. Merry wrote: > On Fri, Jun 08, 2012 at 16:25:31 -0500, Dustin Wenz wrote: >> I just installed a build of 9.0-STABLE in order to test the changes since release. I was hoping that some of the error-handling in mps would alter the behavior I've seen with some SATA disks (particularly, Seagate ST3000DM001 disks) connected through an LSI SAS 9201-16e HBA. >> > > Are you using an expander, or are the disks connected directly to the HBA? > > What firmware version are you using on the HBA? Make sure you have the > latest firmware version on the card. > >> It is apparently possible for these disks to get in a state where their presence prevents the machine from booting. This problem has existed for some time, according to some archive-searching I've done, but there isn't much consensus on how to fix it. >> >> The disks are good enough that they can be probed at startup, but some part of initialization cannot complete. 
This is the message I see repeated forever upon boot (the probe number does change slightly): >> >> (probe14:mps0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0 length 36 SMID 215 terminated ioc 804b scsi 0 state c xfer 0 >> >> There is a comment in mps_sas.c which suggests that this error is usually transient, but that seems not to be the case here. Can anyone suggest a modification that might permit booting in this state? >> > > There is not a lot that the driver can do in this case. The command is > getting terminated by the firmware in the HBA, and we really don't have a > lot of information to indicate why. > > You could change the status returned for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED > to CAM_REQ_CMP_ERR, and that would just mean that the probe for that disk > would eventually fail and the kernel would boot. CAM_REQUEUE_REQ tells > CAM to retry the command without decrementing the retry count. That is > why you aren't able to boot. > > If upgrading the HBA firmware doesn't fix the problem, I would suggest > contacting LSI support, and see if they can get additional diagnostics off > the board to figure out what the problem is. 
> > Ken > -- > Kenneth Merry > ken@FreeBSD.ORG _______________________________________________ freebsd-scsi@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-scsi To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" From owner-freebsd-scsi@FreeBSD.ORG Sat Jun 16 06:29:07 2012 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 92303106564A for ; Sat, 16 Jun 2012 06:29:07 +0000 (UTC) (envelope-from martin@gneto.com) Received: from smtp.mullet.se (smtp.mullet.se [94.247.168.122]) by mx1.freebsd.org (Postfix) with ESMTP id 063588FC0C for ; Sat, 16 Jun 2012 06:29:07 +0000 (UTC) Received: from mbp.gneto.com (ua-83-227-181-30.cust.bredbandsbolaget.se [83.227.181.30]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.mullet.se (Postfix) with ESMTPSA id 6468E6270035 for ; Sat, 16 Jun 2012 08:19:17 +0200 (CEST) Message-ID: <4FDC2564.3070501@gneto.com> Date: Sat, 16 Jun 2012 08:19:16 +0200 From: Martin Nilsson User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:13.0) Gecko/20120601 Thunderbird/13.0 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org References: <60F17E0E-EE4A-4F37-9925-055315B987B1@ebureau.com> <20120608215326.GA83721@nargothrond.kdm.org> <551EFA9B-74F7-4CFC-954C-C9E0440E2BDC@ebureau.com> In-Reply-To: <551EFA9B-74F7-4CFC-954C-C9E0440E2BDC@ebureau.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Marginal disks prevent boot with mps(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Jun 2012 06:29:07 -0000 Have you checked that you don't have buggy firmware in the ST3000DM001 drive? 
Seagate have updates for some versions from last fall on their website.

On 2012-06-16 01:06, Dustin Wenz wrote:
> I just received an SFF-8088->8087 cable via FedEx this morning, which allowed me to continue to isolate this problem.
>
> What I discovered is that it makes no difference whether a bad disk is connected to an expander, or if one is connected directly to the HBA. So, if this is a hardware bug, it must be present in the LSI SAS2008-based HBA that I'm using. The firmware on the card was also upgraded from v11.00.00.00 to v13.00.57.00, which is the latest as far as I am aware. That did not seem to change the behavior.
>
> I did notice that earlier during startup, I see this message a page or so before the endless ioc messages start:
> mps0: polling failed
> mpssas_get_sata_identify: poll for page completed with error 60_mapping_get_dev
> info: failed to compute the hashed SAS address for SATA device with handle 0x0009
>
> It seems that the driver knows something is up, even before it gets stuck later on...
>
> So far, the only way I can get this configuration to boot is to change the status for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED to CAM_REQ_CMP_ERR, as Ken mentioned. That change will still cause the machine to report some "ioc terminated" messages, but will not hang the startup process indefinitely. However, I'm not sure what the implications of making that change on a production machine would be.
>
> If this is LSI's problem, I don't see why they would bother to fix it. As far as I know, they are the only 6Gb SAS/SATA HBA vendor that works on FreeBSD. We have no choice but to buy their stuff, even if it's not robust.
>
> - .Dustin
>
> On Jun 8, 2012, at 4:53 PM, Kenneth D. Merry wrote:
>
>> On Fri, Jun 08, 2012 at 16:25:31 -0500, Dustin Wenz wrote:
>>> I just installed a build of 9.0-STABLE in order to test the changes since release.
I was hoping that some of the error-handling in mps would alter the behavior I've seen with some SATA disks (particularly, Seagate ST3000DM001 disks) connected through an LSI SAS 9201-16e HBA.
>>>
>> Are you using an expander, or are the disks connected directly to the HBA?
>>
>> What firmware version are you using on the HBA? Make sure you have the
>> latest firmware version on the card.
>>
>>> It is apparently possible for these disks to get in a state where their presence prevents the machine from booting. This problem has existed for some time, according to some archive-searching I've done, but there isn't much consensus on how to fix it.
>>>
>>> The disks are good enough that they can be probed at startup, but some part of initialization cannot complete. This is the message I see repeated forever upon boot (the probe number does change slightly):
>>>
>>> (probe14:mps0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0 length 36 SMID 215 terminated ioc 804b scsi 0 state c xfer 0
>>>
>>> There is a comment in mps_sas.c which suggests that this error is usually transient, but that seems not to be the case here. Can anyone suggest a modification that might permit booting in this state?
>>>
>> There is not a lot that the driver can do in this case. The command is
>> getting terminated by the firmware in the HBA, and we really don't have a
>> lot of information to indicate why.
>>
>> You could change the status returned for MPI2_IOCSTATUS_SCSI_IOC_TERMINATED
>> to CAM_REQ_CMP_ERR, and that would just mean that the probe for that disk
>> would eventually fail and the kernel would boot. CAM_REQUEUE_REQ tells
>> CAM to retry the command without decrementing the retry count. That is
>> why you aren't able to boot.
>>
>> If upgrading the HBA firmware doesn't fix the problem, I would suggest
>> contacting LSI support, and see if they can get additional diagnostics off
>> the board to figure out what the problem is.
>>
>> Ken
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

From owner-freebsd-scsi@FreeBSD.ORG Sat Jun 16 23:06:39 2012
Return-Path:
Delivered-To: scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
        by hub.freebsd.org (Postfix) with ESMTP id 90D5D106566C;
        Sat, 16 Jun 2012 23:06:39 +0000 (UTC)
        (envelope-from mj@feral.com)
Received: from ns1.feral.com (ns1.feral.com [192.67.166.1])
        by mx1.freebsd.org (Postfix) with ESMTP id 4EAD28FC15;
        Sat, 16 Jun 2012 23:06:39 +0000 (UTC)
Received: from [192.168.135.100] (c-76-126-166-136.hsd1.ca.comcast.net [76.126.166.136])
        (authenticated bits=0)
        by ns1.feral.com (8.14.4/8.14.4) with ESMTP id q5GN6VSF021992
        (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
        Sat, 16 Jun 2012 16:06:32 -0700 (PDT)
        (envelope-from mj@feral.com)
Message-ID: <4FDD1172.7040302@feral.com>
Date: Sat, 16 Jun 2012 16:06:26 -0700
From: Matthew Jacob
Organization: Feral Software
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: "Justin T. Gibbs"
References: <4FDB6F06.6080108@feral.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
        (ns1.feral.com [192.67.166.1]); Sat, 16 Jun 2012 16:06:33 -0700 (PDT)
Cc: scsi@freebsd.org
Subject: Re: headsup on enclosure driver
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: mj@feral.com
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 16 Jun 2012 23:06:39 -0000

> I'm not sure that this error is benign.
>
>
I'm sure it isn't.
But I'm also pretty sure it's crappy h/w. The question is whether you want, by default, the SES driver to spam everything. I would claim not.