From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 10:49:04 2014 Return-Path: Delivered-To: freebsd-scsi@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 02F837E6; Mon, 3 Feb 2014 10:49:04 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C9035181C; Mon, 3 Feb 2014 10:49:03 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s13An3HM016936; Mon, 3 Feb 2014 10:49:03 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s13An3uV016935; Mon, 3 Feb 2014 10:49:03 GMT (envelope-from linimon) Date: Mon, 3 Feb 2014 10:49:03 GMT Message-Id: <201402031049.s13An3uV016935@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-scsi@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/186258: [mps] Heap overrun in mps(4) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:49:04 -0000 Synopsis: [mps] Heap overrun in mps(4) Responsible-Changed-From-To: freebsd-bugs->freebsd-scsi Responsible-Changed-By: linimon Responsible-Changed-When: Mon Feb 3 10:48:43 UTC 2014 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=186258 From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 11:06:53 2014 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C82B219D for ; Mon, 3 Feb 2014 11:06:53 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id B3E821A5A for ; Mon, 3 Feb 2014 11:06:53 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s13B6rt0022765 for ; Mon, 3 Feb 2014 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s13B6rZ0022763 for freebsd-scsi@FreeBSD.org; Mon, 3 Feb 2014 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 3 Feb 2014 11:06:53 GMT Message-Id: <201402031106.s13B6rZ0022763@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 11:06:53 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description
--------------------------------------------------------------------------------
o kern/186258   scsi  [mps] Heap overrun in mps(4)
o kern/184059   scsi  [mps] mps SCSI driver causes FreeBSD to hang during bo
o kern/179932   scsi  [ciss] ciss i/o stall problem with HP Bl Gen8 (and HP
o kern/178795   scsi  [mps] MSI for mps driver doesn't work under vmware
o kern/165982   scsi  [mpt] mpt instability, drive resets, and losses on Fre
o kern/165740   scsi  [cam] SCSI code must drain callbacks before free
f kern/162256   scsi  [mpt] QUEUE FULL EVENT and 'mpt_cam_event: 0x0'
o docs/151336   scsi  Missing documentation of scsi_ and ata_ functions in c
o kern/148083   scsi  [aac] Strange device reporting
o kern/144648   scsi  [aac] Strange values of speed and bus width in dmesg
o kern/142351   scsi  [mpt] LSILogic driver performance problems
o kern/134488   scsi  [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/130621   scsi  [mpt] tranfer rate is inscrutable slow when use lsi213
f kern/129602   scsi  [ahd] ahd(4) gets confused and wedges SCSI bus
f kern/123674   scsi  [ahc] ahc driver dumping
o sparc/121676  scsi  [iscsi] iscontrol do not connect iscsi-target on sparc

16 problems total.

From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 11:20:13 2014 Return-Path: Delivered-To: freebsd-scsi@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9D93075; Mon, 3 Feb 2014 11:20:13 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 712C41CC9; Mon, 3 Feb 2014 11:20:13 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s13BKDJ8029039; Mon, 3 Feb 2014 11:20:13 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s13BKDWh029038; Mon, 3 Feb 2014 11:20:13 GMT (envelope-from linimon) Date: Mon, 3 Feb 2014 11:20:13 GMT Message-Id: <201402031120.s13BKDWh029038@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-scsi@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/184975: [ses] SCSI Environmental Services (ses) driver report wrong information X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 11:20:13 -0000 Old Synopsis: SCSI Environmental Services (ses) driver report wrong information New Synopsis: [ses] SCSI Environmental Services (ses) driver report wrong information Responsible-Changed-From-To: freebsd-bugs->freebsd-scsi Responsible-Changed-By: linimon Responsible-Changed-When: Mon Feb 3 11:19:56 UTC 2014 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=184975 From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 16:23:51 2014 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 64B0F271; Mon, 3 Feb 2014 16:23:51 +0000 (UTC) Received: from smtp.infotech.no (smtp.infotech.no [82.134.31.41]) by mx1.freebsd.org (Postfix) with ESMTP id E3DB618B1; Mon, 3 Feb 2014 16:23:47 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp.infotech.no (Postfix) with ESMTP id 30A282041C3; Mon, 3 Feb 2014 17:14:43 +0100 (CET) X-Virus-Scanned: by amavisd-new-2.6.6 (20110518) (Debian) at infotech.no Received: from smtp.infotech.no ([127.0.0.1]) by localhost (smtp.infotech.no [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id IX0Pf4w+iqhg; Mon, 3 Feb 2014 17:14:43 +0100 (CET) Received: from [10.7.0.30] (unknown [10.7.0.30]) by smtp.infotech.no (Postfix) with ESMTPA id 651C22041AF; Mon, 3 Feb 2014 17:14:42 +0100 (CET) Message-ID: <52EFC058.20404@interlog.com> Date: Mon, 03 Feb 2014 11:14:16 -0500 From: Douglas Gilbert User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-scsi@FreeBSD.org Subject: Re: kern/184975: [ses] SCSI Environmental Services (ses) driver report wrong information References: <201402031120.s13BKDWh029038@freefall.freebsd.org> In-Reply-To: <201402031120.s13BKDWh029038@freefall.freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list Reply-To: dgilbert@interlog.com List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 16:23:51 -0000 On 14-02-03 06:20 AM, 
linimon@FreeBSD.org wrote:
> Old Synopsis: SCSI Environmental Services (ses) driver report wrong information
> New Synopsis: [ses] SCSI Environmental Services (ses) driver report wrong information

s/Environmental/Enclosure/

In my experience, OSes (e.g. Linux and FreeBSD) can have problems
with ses drivers inside the kernel because the enclosure vendors
are so sloppy in following the various SES standards. There are
exceptions but on average the current implementations destroy
what otherwise would have been a good idea.

Doug Gilbert

From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 16:32:30 2014 Return-Path: Delivered-To: scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 730EB48C; Mon, 3 Feb 2014 16:32:30 +0000 (UTC) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4F272196A; Mon, 3 Feb 2014 16:32:30 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id s13GWJwA013452 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 3 Feb 2014 08:32:19 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id s13GWJsi013451; Mon, 3 Feb 2014 08:32:19 -0800 (PST) (envelope-from sgk) Date: Mon, 3 Feb 2014 08:32:19 -0800 From: Steve Kargl To: John Baldwin Subject: Re: Instant panic CAM or USB subsystem Message-ID: <20140203163219.GA13386@troutmask.apl.washington.edu> References: <20140125172106.GA67590@troutmask.apl.washington.edu> <201401281232.21958.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201401281232.21958.jhb@freebsd.org> User-Agent: Mutt/1.5.22 (2013-10-16) Cc: Alexander Motin , freebsd-current@freebsd.org, scsi@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 16:32:30 -0000

On Tue, Jan 28, 2014 at 12:32:21PM -0500, John Baldwin wrote:
> On Saturday, January 25, 2014 12:21:06 pm Steve Kargl wrote:
> > If I plug my Samsung Intensity II cellphone into a usb port,
> > I get an instant panic. This is 100% reproducible. I have
> > the core and kernel for further debugging. Dmesg.boot follows
> > my sig.
> >
> > % kgdb /boot/kernel/kernel /vmcore.0
> >
> > Unread portion of the kernel message buffer:
> > cd1 at umass-sim1 bus 1 scbus4 target 0 lun 0
> > cd1: Removable CD-ROM SCSI-2 device
> > cd1: Serial Number 000000000002
> > cd1: 1.000MB/s transfers
> > cd1: cd present [3840000 x 512 byte records]
> > cd1: quirks=0x10<10_BYTE_ONLY>
> > panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
> > cpuid = 0
> > KDB: enter: panic
>
> scsi@ might work better for this. It looks like when cdasync() calls
> cam_periph_alloc() it doesn't have its associated xpt_path locked. All the
> other async xpt callbacks I looked at don't lock the xpt path either. It
> seems they expect it to be locked by the caller when they are invoked. It
> seems xpt_async_process_dev() doesn't always lock xpt_lock, but sometimes
> locks the device instead:
>
>         /*
>          * If async for specific device is to be delivered to
>          * the wildcard client, take the specific device lock.
>          * XXX: We may need a way for client to specify it.
>          */
>         if ((device->lun_id == CAM_LUN_WILDCARD &&
>              path->device->lun_id != CAM_LUN_WILDCARD) ||
>             (device->target->target_id == CAM_TARGET_WILDCARD &&
>              path->target->target_id != CAM_TARGET_WILDCARD) ||
>             (device->target->bus->path_id == CAM_BUS_WILDCARD &&
>              path->target->bus->path_id != CAM_BUS_WILDCARD)) {
>                 mtx_unlock(&device->device_mtx);
>                 xpt_path_lock(path);
>                 relock = 1;
>         } else
>                 relock = 0;
>
>         (*(device->target->bus->xport->async))(async_code,
>             device->target->bus, device->target, device, async_arg);
>         xpt_async_bcast(&device->asyncs, async_code, path, async_arg);
>
>         if (relock) {
>                 xpt_path_unlock(path);
>                 mtx_lock(&device->device_mtx);
>         }
>
> Maybe try going up to this frame (16) in your dump and do
> 'p *device->target'? However, someone with more CAM knowledge needs to look
> at this to see what is actually broken.
>

I finally have time to look at this again. Here's kgdb for
frame 16 as you suggested and then frame 17.

Script started on Mon Feb 3 08:16:32 2014
% kgdb /dsk1/obj/usr/src/sys/MOBILE/kernel.debug vmcore.0

Unread portion of the kernel message buffer:
panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
cpuid = 1
KDB: enter: panic

#16 0xc047d6a5 in xpt_async_process_dev (device=, arg=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4208
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
#18 0xc047bd15 in xpt_done_process (ccb_h=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:5249
#19 0xc047ef14 in xpt_done_td (arg=) at /usr/src/sys/cam/cam_xpt.c:5276
#20 0xc0723daf in fork_exit (callout=0xc047edb0 )
    at /usr/src/sys/kern/kern_fork.c:977
#21 0xc09fb3e4 in fork_trampoline ()
    at /usr/src/sys/i386/i386/exception.s:278
Current language: auto; currently minimal
(kgdb) frame 16
#16 0xc047d6a5 in xpt_async_process_dev (device=, arg=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4208
4208            cur_entry->callback(cur_entry->callback_arg,
(kgdb) p *device
Cannot access memory at address 0x0
(kgdb) up 1
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
4173            xpt_async_process_dev(xpt_periph->path->device, ccb);
(kgdb) p *xpt_periph->path->device->target
$2 = {ed_entries = {tqh_first = 0xc6f4b800, tqh_last = 0xc6f4b80c}, links = {
    tqe_next = 0x0, tqe_prev = 0xc6eaaa00}, bus = 0xc6eaaa00,
  target_id = 4294967295, refcount = 2, generation = 1, last_reset = {
    tv_sec = 0, tv_usec = 0}, rpl_size = 0, luns = 0x0, luns_mtx = {
    lock_object = {lo_name = 0xc0a3f9bc "CAM LUNs lock", lo_flags = 16973824,
      lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) p *xpt_periph->path->device->target->bus
$3 = {et_entries = {tqh_first = 0xc6eaa980, tqh_last = 0xc6eaa988}, links = {
    tqe_next = 0x0, tqe_prev = 0xc7690008}, path_id = 4294967295,
  sim = 0xc6eaaa80, last_reset = {tv_sec = 0, tv_usec = 0}, flags = 0,
  refcount = 3, generation = 3, parent_dev = 0x0, xport = 0xc0b2f568,
  eb_mtx = {lock_object = {lo_name = 0xc0a3f85a "CAM bus lock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) quit
% exit
exit

Script done on Mon Feb 3 08:20:44 2014

-- 
Steve

From owner-freebsd-scsi@FreeBSD.ORG Mon Feb 3 22:00:43 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A971F75D; Mon, 3 Feb 2014 22:00:43 +0000 (UTC) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [IPv6:2001:470:8b2d:1e1c:21b:21ff:feb8:d7b0]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 71E8A18E4; Mon, 3 Feb 2014 22:00:43 +0000 (UTC) Received: from khavrinen.csail.mit.edu (localhost [127.0.0.1]) by khavrinen.csail.mit.edu (8.14.7/8.14.7) with ESMTP id s13M0fei087879 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL 
CN=khavrinen.csail.mit.edu issuer=Client+20CA); Mon, 3 Feb 2014 17:00:41 -0500 (EST) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.14.7/8.14.7/Submit) id s13M0fJq087876; Mon, 3 Feb 2014 17:00:41 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21232.4489.544435.898780@khavrinen.csail.mit.edu> Date: Mon, 3 Feb 2014 17:00:41 -0500 From: Garrett Wollman To: "Kenneth D. Merry" Subject: Re: Heap overflow in mps(4) (was: Re: stable/9 mps(4) rev 254938 == BOOM!) In-Reply-To: <20140131003342.GA11755@nargothrond.kdm.org> References: <21225.19508.683025.581620@khavrinen.csail.mit.edu> <201401292137.s0TLbD5G006716@hergotha.csail.mit.edu> <20140129221514.GA47535@nargothrond.kdm.org> <21225.38749.179621.454579@khavrinen.csail.mit.edu> <20140131003342.GA11755@nargothrond.kdm.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (khavrinen.csail.mit.edu [127.0.0.1]); Mon, 03 Feb 2014 17:00:41 -0500 (EST) Cc: freebsd-scsi@freebsd.org, scottl@freebsd.org, freebsd-stable@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 22:00:43 -0000

< said:

> The attached patch should fix the leaked allocations. I'm CCing Steve and
> Kashyap at LSI so that they can verify that this is the right place to do
> the mapping shutdown.

It does fix the leak.

> I don't know yet why that particular change is causing problems. Perhaps
> it just moved things around and exposed an existing problem.

> The fact that the redzone code doesn't expose any problems makes it more
> likely that it is a problem other than a heap overflow.

> Since it is consistent, is there any chance you could hook up remote gdb to
> the box and poke around when it crashes? Perhaps you'll see something
> interesting that will point to the problem.

No way to do a remote GDB, unfortunately. However, I tried a few
other things:

- It makes no difference whether mps.ko is preloaded or loaded in
  single-user mode.

- If I boot a kernel/modules without redzone, loading mps.ko
  instapanics, in a very different place (apologies for the poor
  transcription; I can either be up in the machine room to plug in USB
  sticks or use the serial console, not both):

--- trap 0xc, rip = 0xffff....f807e934a, rsp = 0xff...94da4c48f0, rbp = 0xff...94da4c4950 ---
bzero() at bzero+0xa/frame 0xff...94da4c4af0
mpssas_add_device() at mpssas_add_device+0x78/frame 0xff..94da4c4af0
mpssas_firmware_event_work() at mpssas_firmware_event_work+0x437/frame 0xff....94da4c4b78
taskqueue_run_locked() at taskqueue_run_locked+0x74/frame 0xff..94da4c4bc0
taskqueue_thread_loop() at taskqueue_thread_loop+0x46/frame 0xff..94da4c4be0

Inspection of the code does not reveal any arc from mpssas_add_device
to bzero. The return address in the frame is the location of the
first function call (to mpssas_startup_increment()) in
mpssas_add_device(). So I think it's fair to say that something is
scribbling over memory in quite a bad way.

Two things that may be relevant: on boot, this server's MPT2 BIOS
always complains "adapter configuration may have changed", and I
haven't discovered anything in the configuration utility that changes
this. Also, on boot, I always get the following messages:

failure at /usr/src-9-stable/sys/dev/mps/mps_sas_lsi.c:667/mpssas_add_device()! Could not get ID for device with handle 0x0010
mpssas_fw_work: failed to add device with handle 0x10

This has been true across mps(4) revisions, on all three copies of
this hardware that I have in service.

-GAWollman From owner-freebsd-scsi@FreeBSD.ORG Tue Feb 4 07:39:42 2014 Return-Path: Delivered-To: scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 30C01A29; Tue, 4 Feb 2014 07:39:42 +0000 (UTC) Received: from mail-ee0-f41.google.com (mail-ee0-f41.google.com [74.125.83.41]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 65F091C1F; Tue, 4 Feb 2014 07:39:40 +0000 (UTC) Received: by mail-ee0-f41.google.com with SMTP id e51so1935040eek.28 for ; Mon, 03 Feb 2014 23:39:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=63JPcyGQBiTBLad94svl/IPeP2Dhluug8kcL3MhloBI=; b=lo4du05u/+Ny5joYVWxyACv33yJ2Lxe/6QE7D4a6cE+JzLAlR8ZuWUyyD4jbIdQjRW gW7ubPNX9sCSdxN5yjcQBAD9xDMuCsan+gWLZ2r9xuact0kKCG0ralz8OZiKuq9MOAey mizK1o2LHv5FcChcDhCNM9g+dlcmuNuvDd6CAzItcTf9t582MpI8pYPil4T/D1oDZpCE 5P1TGMe3rFqENKDtn2dxMpa1YY8k3dCsj1LtXbEPfSQiUK+v26vo1vAi0sdm7n45WOnM GbakhJE2bQKqxW34uHmBCLk/JhOLMDchwDlHQW56qH02dWTWZb1ZJeiEjJni0EUEFJn1 ZTaA== X-Received: by 10.14.29.6 with SMTP id h6mr1034131eea.84.1391499544433; Mon, 03 Feb 2014 23:39:04 -0800 (PST) Received: from mavbook.mavhome.dp.ua ([134.249.139.101]) by mx.google.com with ESMTPSA id m9sm71924582eeh.3.2014.02.03.23.39.02 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 03 Feb 2014 23:39:03 -0800 (PST) Sender: Alexander Motin Message-ID: <52F09914.5040202@FreeBSD.org> Date: Tue, 04 Feb 2014 09:39:00 +0200 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: Steve Kargl , John Baldwin Subject: Re: Instant panic CAM or USB subsystem 
References: <20140125172106.GA67590@troutmask.apl.washington.edu> <201401281232.21958.jhb@freebsd.org> <20140128195842.GA83173@troutmask.apl.washington.edu> In-Reply-To: <20140128195842.GA83173@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, scsi@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 07:39:42 -0000

On 28.01.2014 21:58, Steve Kargl wrote:
> On Tue, Jan 28, 2014 at 12:32:21PM -0500, John Baldwin wrote:
>> On Saturday, January 25, 2014 12:21:06 pm Steve Kargl wrote:
>>> If I plug my Samsung Intensity II cellphone into a usb port,
>>> I get an instant panic. This is 100% reproducible. I have
>>> the core and kernel for further debugging. Dmesg.boot follows
>>> my sig.
>>>
>>> % kgdb /boot/kernel/kernel /vmcore.0
>>>
>>> Unread portion of the kernel message buffer:
>>> cd1 at umass-sim1 bus 1 scbus4 target 0 lun 0
>>> cd1: Removable CD-ROM SCSI-2 device
>>> cd1: Serial Number 000000000002
>>> cd1: 1.000MB/s transfers
>>> cd1: cd present [3840000 x 512 byte records]
>>> cd1: quirks=0x10<10_BYTE_ONLY>
>>> panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
>>> cpuid = 0
>>> KDB: enter: panic
>>
>> scsi@ might work better for this. It looks like when cdasync() calls
>> cam_periph_alloc() it doesn't have its associated xpt_path locked. All the
>> other async xpt callbacks I looked at don't lock the xpt path either. It
>> seems they expect it to be locked by the caller when they are invoked. It
>> seems xpt_async_process_dev() doesn't always lock xpt_lock, but sometimes
>> locks the device instead:
>>
>>         /*
>>          * If async for specific device is to be delivered to
>>          * the wildcard client, take the specific device lock.
>>          * XXX: We may need a way for client to specify it.
>>          */
>>         if ((device->lun_id == CAM_LUN_WILDCARD &&
>>              path->device->lun_id != CAM_LUN_WILDCARD) ||
>>             (device->target->target_id == CAM_TARGET_WILDCARD &&
>>              path->target->target_id != CAM_TARGET_WILDCARD) ||
>>             (device->target->bus->path_id == CAM_BUS_WILDCARD &&
>>              path->target->bus->path_id != CAM_BUS_WILDCARD)) {
>>                 mtx_unlock(&device->device_mtx);
>>                 xpt_path_lock(path);
>>                 relock = 1;
>>         } else
>>                 relock = 0;
>>
>>         (*(device->target->bus->xport->async))(async_code,
>>             device->target->bus, device->target, device, async_arg);
>>         xpt_async_bcast(&device->asyncs, async_code, path, async_arg);
>>
>>         if (relock) {
>>                 xpt_path_unlock(path);
>>                 mtx_lock(&device->device_mtx);
>>         }
>>
>> Maybe try going up to this frame (16) in your dump and do
>> 'p *device->target'? However, someone with more CAM knowledge needs to look
>> at this to see what is actually broken.
>>
>> It seems a bit odd that it thinks your phone is a CD player.
>
> Thanks for the follow-up. I poked around a bit, but don't
> recall looking at *device->target. Under Windows, 3
> filesystems show up, and the one causing problems is listed
> as CDFS.

I guess problem may be not that phone is reported as CD, but that it is
reported as several CDs on one target. Note that you already see cd1
reported, but another one was still trying to allocate when system panicked.

I think that CAM CD driver incorrectly assumes that your device is CD
changer. I've pulled real 5-disk SCSI CD changer from my depths of my
table and got panic very much like yours just on boot. It seems that
respective changer code was not properly re-locked during recent CAM
locking project.

I am going to analyze this case deeper to fix it properly, while for
your case I can propose such quick quirk:

--- scsi_cd.c   (revision 261448)
+++ scsi_cd.c   (working copy)
@@ -223,6 +223,10 @@ static struct cd_quirk_entry cd_quirk_table[] =
        {
                { T_CDROM, SIP_MEDIA_REMOVABLE, "CHINON", "CD-ROM CDS-535","*"},
                /* quirks */ CD_Q_BCD_TRACKS
+       },
+       {
+               { T_CDROM, SIP_MEDIA_REMOVABLE, "SAMSUNG", "CD-ROM","1.00"},
+               /* quirks */ CD_Q_NO_CHANGER
        }
 };

-- 
Alexander Motin

From owner-freebsd-scsi@FreeBSD.ORG Tue Feb 4 22:09:59 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A0F56B4C; Tue, 4 Feb 2014 22:09:59 +0000 (UTC) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [IPv6:2001:470:8b2d:1e1c:21b:21ff:feb8:d7b0]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 67B151210; Tue, 4 Feb 2014 22:09:59 +0000 (UTC) Received: from khavrinen.csail.mit.edu (localhost [127.0.0.1]) by khavrinen.csail.mit.edu (8.14.7/8.14.7) with ESMTP id s14M9v7P098695 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL CN=khavrinen.csail.mit.edu issuer=Client+20CA); Tue, 4 Feb 2014 17:09:57 -0500 (EST) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.14.7/8.14.7/Submit) id s14M9vMo098692; Tue, 4 Feb 2014 17:09:57 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21233.25909.355102.743155@khavrinen.csail.mit.edu> Date: Tue, 4 Feb 2014 17:09:57 -0500 From: Garrett Wollman To: "Kenneth D. Merry" Subject: Re: Heap overflow in mps(4) (was: Re: stable/9 mps(4) rev 254938 == BOOM!) 
In-Reply-To: <20140131003342.GA11755@nargothrond.kdm.org> References: <21225.19508.683025.581620@khavrinen.csail.mit.edu> <201401292137.s0TLbD5G006716@hergotha.csail.mit.edu> <20140129221514.GA47535@nargothrond.kdm.org> <21225.38749.179621.454579@khavrinen.csail.mit.edu> <20140131003342.GA11755@nargothrond.kdm.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (khavrinen.csail.mit.edu [127.0.0.1]); Tue, 04 Feb 2014 17:09:57 -0500 (EST) Cc: freebsd-scsi@freebsd.org, scottl@freebsd.org, freebsd-stable@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 22:09:59 -0000

< said:

> The fact that the redzone code doesn't expose any problems makes it more
> likely that it is a problem other than a heap overflow.

So I built a new kernel with DEBUG_MEMGUARD. When
vm.memguard.desc="mps", everything works fine both through two
load/unload cycles and statically compiled into the kernel. When
vm.memguard.desc is not set, instapanic as before. (I'm trying
memguard rather than redzone as it has much less of a performance
impact, so I can start doing some of the performance testing I was
originally intending to do.)

Are there any debugging options that I could usefully enable that
would show just what mps is doing when the fault happens? I see that
there are lots of tracing options but I don't know what would actually
be useful.

-GAWollman From owner-freebsd-scsi@FreeBSD.ORG Tue Feb 4 23:34:57 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 82869720 for ; Tue, 4 Feb 2014 23:34:57 +0000 (UTC) Received: from nm28-vm3.bullet.mail.ne1.yahoo.com (nm28-vm3.bullet.mail.ne1.yahoo.com [98.138.91.158]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3F52A1A43 for ; Tue, 4 Feb 2014 23:34:56 +0000 (UTC) Received: from [98.138.101.132] by nm28.bullet.mail.ne1.yahoo.com with NNFMP; 04 Feb 2014 23:34:50 -0000 Received: from [98.138.226.56] by tm20.bullet.mail.ne1.yahoo.com with NNFMP; 04 Feb 2014 23:34:45 -0000 Received: from [127.0.0.1] by smtp207.mail.ne1.yahoo.com with NNFMP; 04 Feb 2014 23:34:45 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391556885; bh=rxZw2dsf4UHe/Sc+SnwRZg6sr1Ok71SgknlkGhVc8N0=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=h8R7Lax0flGdhDs/Q6iauYUmtUQHFFHtWsjZUXTGxA2Og0a+lyabQWWbRfD/a+3593mQ+fdircNba2hunngEsXUW0bXS4ISC/H2jFIjCvXQ8Uyn5WsmN+PpzvENS+x8Nn3uGJ2F8PdRT7tj7KaxVO9JRr136j1rIvbffu6oF9XU= X-Yahoo-Newman-Id: 796784.87775.bm@smtp207.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: p1KzD_EVM1n5AsdNmuVw5Ea9tSJeRDQxjjXsagOgc6gBsPo d5YHDgWG6PTqT9rbcuJbL33iXpJw3xgvi5Uw9xN6yE8j9y_n0IuHfxLr3VQb yc7M0uyzSq3k_r.1B0f3_EQIb4uAN80z_RCG_1lHernfEZuVJocsg11iH6nW VBtto9FkfpH8lB0s_5fWNv.cvrn99MQaGRK6tN.IlGji3t.xLJpAHdsqmzhe hfb7GuBBZCyX5fqm5ilszP.44eAzqRhH2PPifotjIkBTfi1yVFwFK7NkmymU IcNrp6.kWZiL0ZlTbNNhoCJp_Z8OQylNMuI85X2bmYvtaJLzdrQF5QDcWxQw 
DW9Sfq0OG7pUI.dL4MynGbBLIRqWF8PqrVTGussdnnGZ3afsUdCojBWN8fQF gRAnr6QfOO7g0O1k.1bzZTTLx1rskrZGsKReIV6er31w0FJ_DrxYjQnZTPbk 76SYQvpTBrfXA9dg26GMeKy_QJxSsM0sL9FHYNmU5n798YlP3jYoaF0LXsW8 A4j_qQ21kVcwhvkNobm.KOePKH8yHlAk467Rx8gv86yqWZ2PncN8d210sSHa BmL.40IEV X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from lgmac-eding.corp.netflix.com (scott4long@69.53.236.251 with plain [63.250.193.228]) by smtp207.mail.ne1.yahoo.com with SMTP; 04 Feb 2014 15:34:40 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: Heap overflow in mps(4) (was: Re: stable/9 mps(4) rev 254938 == BOOM!) From: Scott Long In-Reply-To: <21233.25909.355102.743155@khavrinen.csail.mit.edu> Date: Tue, 4 Feb 2014 16:34:36 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <21225.19508.683025.581620@khavrinen.csail.mit.edu> <201401292137.s0TLbD5G006716@hergotha.csail.mit.edu> <20140129221514.GA47535@nargothrond.kdm.org> <21225.38749.179621.454579@khavrinen.csail.mit.edu> <20140131003342.GA11755@nargothrond.kdm.org> <21233.25909.355102.743155@khavrinen.csail.mit.edu> To: Garrett Wollman X-Mailer: Apple Mail (2.1827) Cc: freebsd-stable@freebsd.org, "Kenneth D. Merry" , "FreeBSD-scsi@freebsd.org" X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 23:34:57 -0000 On Feb 4, 2014, at 3:09 PM, Garrett Wollman = wrote: > < said: >=20 >> The fact that the redzone code doesn't expose any problems makes it = more >> likely that it is a problem other than a heap overflow. >=20 > So I built a new kernel with DEBUG_MEMGUARD. When > vm.memguard.desc=3D"mps", everything works fine both through two > load/unload cycles and statically compiled into the kernel. When > vm.memguard.desc is not set, instapanic as before. 
(I'm trying > memguard rather than redzone as it has much less of a performance > impact, so I can start doing some of the performance testing I was > originally intending to do. > > Are there any debugging options that I could usefully enable that > would show just what mps is doing when the fault happens? I see that > there are lots of tracing options but I don't know what would actually > be useful. > Try the patch at http://people.freebsd.org/~scottl/mps.memguard.diff I haven't even compile tested it, so hopefully any mistakes are easy to fix and aren't too embarrassing. The target array is an obvious culprit since it's often indexed without bounds. If this doesn't fix it then I'll have to think of some other culprits. Another next step would be to further divide and test the M_MPT2 malloc allocation type. Scott From owner-freebsd-scsi@FreeBSD.ORG Wed Feb 5 02:08:14 2014 Return-Path: Delivered-To: scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A944CEAD; Wed, 5 Feb 2014 02:08:14 +0000 (UTC) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 85F5F1A70; Wed, 5 Feb 2014 02:08:14 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id s152846x054127 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 4 Feb 2014 18:08:04 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id s15284XP054126; Tue, 4 Feb 2014 18:08:04 -0800 (PST) (envelope-from sgk) Date: Tue, 4 Feb 2014
18:08:04 -0800 From: Steve Kargl To: Alexander Motin Subject: Re: Instant panic CAM or USB subsystem Message-ID: <20140205020804.GA54095@troutmask.apl.washington.edu> References: <20140125172106.GA67590@troutmask.apl.washington.edu> <201401281232.21958.jhb@freebsd.org> <20140128195842.GA83173@troutmask.apl.washington.edu> <52F09914.5040202@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <52F09914.5040202@FreeBSD.org> User-Agent: Mutt/1.5.22 (2013-10-16) Cc: freebsd-current@freebsd.org, John Baldwin , scsi@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 02:08:14 -0000 On Tue, Feb 04, 2014 at 09:39:00AM +0200, Alexander Motin wrote: > > I guess problem may be not that phone is reported as CD, but that it is > reported as several CDs on one target. Note that you already see cd1 > reported, but another one was still trying to allocate when system panicked. Good guess see below. > I think that CAM CD driver incorrectly assumes that your device is CD > changer. I've pulled real 5-disk SCSI CD changer from my depths of my > table and got panic very much like yours just on boot. It seems that > respective changer code was not properly re-locked during recent CAM > locking project. If you come up with a patch, I can test it for you. 
> I am going to analyze this case deeper to fix in properly, while for > your case I can propose such quick quirk: > > --- scsi_cd.c (revision 261448) > +++ scsi_cd.c (working copy) > @@ -223,6 +223,10 @@ static struct cd_quirk_entry cd_quirk_table[] = > { > { T_CDROM, SIP_MEDIA_REMOVABLE, "CHINON", "CD-ROM > CDS-535","*"}, > /* quirks */ CD_Q_BCD_TRACKS > + }, > + { > + { T_CDROM, SIP_MEDIA_REMOVABLE, "SAMSUNG", "CD-ROM","1.00"}, > + /* quirks */ CD_Q_NO_CHANGER > } > }; > With your quirk, the laptop booted and plugging in the cellphone does not cause a panic. :-) dmesg shows ugen3.2: at usbus3 umass1: on usbus3 cd1 at umass-sim1 bus 1 scbus5 target 0 lun 0 cd1: Removable CD-ROM SCSI-2 device cd1: Serial Number 000000000002 cd1: 1.000MB/s transfers cd1: cd present [3840000 x 512 byte records] cd1: quirks=0x14 cd2 at umass-sim1 bus 1 scbus5 target 0 lun 1 cd2: Removable CD-ROM SCSI-2 device cd2: Serial Number 000000000002 cd2: 1.000MB/s transfers cd2: cd present [1084 x 512 byte records] cd2: quirks=0x14 After a few seconds, the cellphone display shows > Sync Music to Phone > Sync Music to Card > Copy/Move Files and the following appears in dmesg ugen3.2: at usbus3 (disconnected) umass1: at uhub3, port 2, addr 2 (disconnected) cd1 at umass-sim1 bus 1 scbus5 target 0 lun 0 cd1: s/n 000000000002 detached cd2 at umass-sim1 bus 1 scbus5 target 0 lun 1 cd2: s/n 000000000002 detached (cd2:umass-sim1:1:0:1): Periph destroyed (cd1:umass-sim1:1:0:0): Periph destroyed ugen3.2: at usbus3 This is fine with me as I only use the laptop as a charging station. 
-- Steve From owner-freebsd-scsi@FreeBSD.ORG Wed Feb 5 19:21:43 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BD44CC87; Wed, 5 Feb 2014 19:21:43 +0000 (UTC) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 8C0D71DB6; Wed, 5 Feb 2014 19:21:43 +0000 (UTC) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 05 Feb 2014 11:26:08 -0800 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.4/8.14.4) with ESMTP id s15JLaUp072430; Wed, 5 Feb 2014 11:21:36 -0800 (PST) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.4/8.14.4/Submit) id s15JLatT072429; Wed, 5 Feb 2014 11:21:36 -0800 (PST) (envelope-from ambrisko) Date: Wed, 5 Feb 2014 11:21:36 -0800 From: Doug Ambrisko To: Mark Johnston Subject: Re: mfi(4) support for MegaRAID Fury cards Message-ID: <20140205192136.GA71309@ambrisko.com> References: <20131227220455.GA6027@charmander.home> <20140124190832.GB28724@ambrisko.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20140124190832.GB28724@ambrisko.com> User-Agent: Mutt/1.4.2.3i Cc: freebsd-scsi@freebsd.org, ambrisko@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 19:21:43 -0000 On Fri, Jan 24, 2014 at 11:08:32AM -0800, Doug Ambrisko wrote: | On Fri, Dec 27, 2013 at 05:04:55PM -0500, Mark Johnston wrote: | | Hello, | | | | The patch here adds mfi(4) support for my LSI 9341-4i controller, which | 
| has device ID 0x5f: | | | | http://people.freebsd.org/~markj/patches/mfi_fury.diff | | | | This diff was mostly obtained by going through the mrsas(4) code | | specific to Invader (DID 0x5d) and Fury (DID 0x5f) controllers. The main | | change is to add an end-of-list marker to scatter-gather DMA lists | | before handing them to the firmware. Without this, large writes to an | | mfi(4) volume result in a firmware crash loop, and the system needs to | | be reset. The diff adds code for both Invader and Fury cards, as this is | | what's done in mrsas(4); I haven't tested with an Invader card though, | | as I don't have access to one. With this patch, I'm able to boot FreeBSD | | 8.2 off of a RAID 1 volume on my 9341-4i. | | | | Would anyone be able to review or test this patch? I'm particularly | | interested if anyone could try it out with an Invader or Fury card | | (there shouldn't be any differences in driver behaviour with other | | cards). | | The patch looks good. I can test it out on a Invader card that I have. | I don't have a Fury card. I was holding off waiting to see how we | should resolve the mrsas(4) driver from LSI conflict. We have been | looking at what needs to be done to get mrsas(4) into FreeBSD. I | posted a change to FreeBSD SCSI list to add a tunable to reduce | the probe priority of mfi(4) for ThunderBolt and later cards. This | way they can both be in the GENERIC kernel etc. and not have an | issue. We'll need to do some minor updates to your patch to work | with that since I added another flag in the ident area. After fixing the merge conflict with my recent change it works with my Invader card. I don't see any issues with the patch. Do you want to redo the patch and then commit it or just commit once you've made the change. Please make sure you do it with -current. After this we should plan to MFC these changes all the way back to 8-stable. Thanks, Doug A. 
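The central change described in this thread is adding an end-of-list marker to a scatter-gather list before it is handed to the firmware. A minimal sketch of that idea follows. It is not the code from mfi_fury.diff or r261535: the struct layout, the flag name SGE_FLAG_END_OF_LIST, its value, and the helper sg_mark_end_of_list() are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the real driver defines its own constant. */
#define SGE_FLAG_END_OF_LIST 0x40u

/* Hypothetical scatter-gather element layout, for illustration only. */
struct sg_entry {
	uint64_t addr;   /* bus address of the segment */
	uint32_t len;    /* segment length in bytes */
	uint8_t  flags;  /* per-element flags */
};

/*
 * Clear the end-of-list marker on every element, then set it on the
 * last one, so that exactly one element terminates the list.
 */
static void
sg_mark_end_of_list(struct sg_entry *sgl, size_t nseg)
{
	size_t i;

	assert(sgl != NULL || nseg == 0);
	if (nseg == 0)
		return;
	for (i = 0; i < nseg; i++)
		sgl[i].flags &= (uint8_t)~SGE_FLAG_END_OF_LIST;
	sgl[nseg - 1].flags |= SGE_FLAG_END_OF_LIST;
}
```

Clearing the flag on every element first matters when list memory is reused: a stale marker left in the middle of a longer list would otherwise truncate it from the consumer's point of view.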
From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 6 02:57:40 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 605554DE; Thu, 6 Feb 2014 02:57:40 +0000 (UTC) Received: from mail-ie0-x22f.google.com (mail-ie0-x22f.google.com [IPv6:2607:f8b0:4001:c03::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1EC281770; Thu, 6 Feb 2014 02:57:40 +0000 (UTC) Received: by mail-ie0-f175.google.com with SMTP id ar20so298059iec.6 for ; Wed, 05 Feb 2014 18:57:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=mxPSjXfZovfRc76HObnv0UGLsgvN+fl3AN0wr2DClKo=; b=f1nK/rF6fClVT57T009/n2iMNw8Ho9Wr8qEOC+SWRIy/xSQPycXIH40OYxbxFsdTg2 B3GWUyxYgH70IPyYBreLteWQgi0R1TpEqgMPggjODCGyK6LhQ9osMjsW4PG/m0fH5ph/ 719OaKhys9dUJfnYC658cHRm+upy78tIRdIUE3zeXWcAmoRt63XEGtq2/KYT1rXydQdq ICrOXwB8zwdeGQbGLIL/ZOwmYlfEC3m5r8HBYC52CbsxcezXVOpAun0LQHvfg0tQfaF2 m7DQnKKNSI25TC8YTfWOzlUXCDs4lZU+P82Fb+GRA3TOPmdCI5ggL/XoHwUIio7ddpxq vkkQ== X-Received: by 10.50.43.233 with SMTP id z9mr27520961igl.33.1391655459491; Wed, 05 Feb 2014 18:57:39 -0800 (PST) Received: from raichu (198-84-185-216.cpe.teksavvy.com. 
[198.84.185.216]) by mx.google.com with ESMTPSA id kt2sm62989701igb.1.2014.02.05.18.57.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 05 Feb 2014 18:57:38 -0800 (PST) Sender: Mark Johnston Date: Wed, 5 Feb 2014 21:57:10 -0500 From: Mark Johnston To: Doug Ambrisko Subject: Re: mfi(4) support for MegaRAID Fury cards Message-ID: <20140206025710.GA77280@raichu> References: <20131227220455.GA6027@charmander.home> <20140124190832.GB28724@ambrisko.com> <20140205192136.GA71309@ambrisko.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20140205192136.GA71309@ambrisko.com> User-Agent: Mutt/1.5.22 (2013-10-16) Cc: freebsd-scsi@freebsd.org, ambrisko@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 02:57:40 -0000 On Wed, Feb 05, 2014 at 11:21:36AM -0800, Doug Ambrisko wrote: > On Fri, Jan 24, 2014 at 11:08:32AM -0800, Doug Ambrisko wrote: > | On Fri, Dec 27, 2013 at 05:04:55PM -0500, Mark Johnston wrote: > | | Hello, > | | > | | The patch here adds mfi(4) support for my LSI 9341-4i controller, which > | | has device ID 0x5f: > | | > | | http://people.freebsd.org/~markj/patches/mfi_fury.diff > | | > | | This diff was mostly obtained by going through the mrsas(4) code > | | specific to Invader (DID 0x5d) and Fury (DID 0x5f) controllers. The main > | | change is to add an end-of-list marker to scatter-gather DMA lists > | | before handing them to the firmware. Without this, large writes to an > | | mfi(4) volume result in a firmware crash loop, and the system needs to > | | be reset. The diff adds code for both Invader and Fury cards, as this is > | | what's done in mrsas(4); I haven't tested with an Invader card though, > | | as I don't have access to one. 
With this patch, I'm able to boot FreeBSD > | | 8.2 off of a RAID 1 volume on my 9341-4i. > | | > | | Would anyone be able to review or test this patch? I'm particularly > | | interested if anyone could try it out with an Invader or Fury card > | | (there shouldn't be any differences in driver behaviour with other > | | cards). > | > | The patch looks good. I can test it out on a Invader card that I have. > | I don't have a Fury card. I was holding off waiting to see how we > | should resolve the mrsas(4) driver from LSI conflict. We have been > | looking at what needs to be done to get mrsas(4) into FreeBSD. I > | posted a change to FreeBSD SCSI list to add a tunable to reduce > | the probe priority of mfi(4) for ThunderBolt and later cards. This > | way they can both be in the GENERIC kernel etc. and not have an > | issue. We'll need to do some minor updates to your patch to work > | with that since I added another flag in the ident area. > > After fixing the merge conflict with my recent change it works with my > Invader card. I don't see any issues with the patch. Thanks! I've committed the change as r261535. > > Do you want to redo the patch and then commit it or just commit once > you've made the change. Please make sure you do it with -current. > After this we should plan to MFC these changes all the way back to > 8-stable. Sure, sounds good. 
-Mark From owner-freebsd-scsi@FreeBSD.ORG Sat Feb 8 21:28:24 2014 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 41968205; Sat, 8 Feb 2014 21:28:24 +0000 (UTC) Received: from pi.nmdps.net (pi.nmdps.net [109.61.102.5]) by mx1.freebsd.org (Postfix) with ESMTP id CB24E10D6; Sat, 8 Feb 2014 21:28:23 +0000 (UTC) Received: from pi.nmdps.net (localhost [127.0.0.1]) (Authenticated sender: krichy@cflinux.hu) by pi.nmdps.net (Postfix) with ESMTPSA id 03B7115E6; Sat, 8 Feb 2014 22:28:15 +0100 (CET) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Date: Sat, 08 Feb 2014 22:28:13 +0100 From: krichy@cflinux.hu To: zfs-devel@freebsd.org, freebsd-scsi@freebsd.org Subject: Re: Outage related to hard drive failure Message-ID: X-Sender: krichy@cflinux.hu User-Agent: Roundcube Webmail/0.9.5 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 21:28:24 -0000 Dear Brian, Unfortunately I just can report about same issue I ran into a few weeks ago. We run a FreeNAS server which hosts the VM images, and serves them over NFS. One time I began receiving notifications that the virtual hosts served by this NAS went down. I checked the NAS, found that one drive attached to a mps/lsi HBA stopped responding to the HBA at all, and thus blocked the whole pool. That was also strange to me that neither a timeout event happened, so actually zfs thought that all drives are fine, just one blocked the whole pool IOs. And unfortunately I even could not offline that drive, only a hard reset helped. 
The drive was so unresponsive that the bootup for FreeNAS also took long, but at least that time zfs somehow noticed the drive is missing, and removed it from the pool. And after that the pool worked fine, of course in a degraded state, but healthy, we could initiate a replacement. I only have the logs for the bootup: mps0: mpssas_scsiio_timeout checking sc 0xffffff8000fd1000 cm 0xffffff800100c0a0 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 500 command timeout cm 0xffffff800100c0a0 ccb 0xfffffe0010f03000 mps0: mpssas_alloc_tm freezing simq mps0: timedout cm 0xffffff800100c0a0 allocated tm 0xffffff8000fe47b0 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 500 completed timedout cm 0xffffff800100c0a0 ccb 0xfffffe0010f03000 during recovery ioc 8048 scsi 0 state c xfer 0 (noperiph:mps0:0:14:0): SMID 6 abort TaskMID 500 status 0x0 code 0x0 count 1 (noperiph:mps0:0:14:0): SMID 6 finished recovery after aborting TaskMID 500 mps0: mpssas_free_tm releasing simq (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 (probe14:mps0:0:14:0): CAM status: Command timeout (probe14:mps0:0:14:0): Retrying command mps0: mpssas_scsiio_timeout checking sc 0xffffff8000fd1000 cm 0xffffff8001032f50 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 986 command timeout cm 0xffffff8001032f50 ccb 0xfffffe0010f03000 mps0: mpssas_alloc_tm freezing simq mps0: timedout cm 0xffffff8001032f50 allocated tm 0xffffff8000fe48f8 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 986 completed timedout cm 0xffffff8001032f50 ccb 0xfffffe0010f03000 during recovery ioc 8048 scsi 0 state c xfer 0 (noperiph:mps0:0:14:0): SMID 7 abort TaskMID 986 status 0x0 code 0x0 count 1 (noperiph:mps0:0:14:0): SMID 7 finished recovery after aborting TaskMID 986 mps0: mpssas_free_tm releasing simq (probe14:mps0:0:14:0): INQUIRY. 
CDB: 12 00 00 00 24 00 (probe14:mps0:0:14:0): CAM status: Command timeout (probe14:mps0:0:14:0): Retrying command mps0: mpssas_scsiio_timeout checking sc 0xffffff8000fd1000 cm 0xffffff8001010c38 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 559 command timeout cm 0xffffff8001010c38 ccb 0xfffffe0010f03000 mps0: mpssas_alloc_tm freezing simq mps0: timedout cm 0xffffff8001010c38 allocated tm 0xffffff8000fe4a40 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 559 completed timedout cm 0xffffff8001010c38 ccb 0xfffffe0010f03000 during recovery ioc 8048 scsi 0 state c xfer 0 (noperiph:mps0:0:14:0): SMID 8 abort TaskMID 559 status 0x0 code 0x0 count 1 (noperiph:mps0:0:14:0): SMID 8 finished recovery after aborting TaskMID 559 mps0: mpssas_free_tm releasing simq (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 (probe14:mps0:0:14:0): CAM status: Command timeout (probe14:mps0:0:14:0): Retrying command mps0: mpssas_scsiio_timeout checking sc 0xffffff8000fd1000 cm 0xffffff8001007278 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 439 command timeout cm 0xffffff8001007278 ccb 0xfffffe0010f03000 mps0: mpssas_alloc_tm freezing simq mps0: timedout cm 0xffffff8001007278 allocated tm 0xffffff8000fe4b88 (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 439 completed timedout cm 0xffffff8001007278 ccb 0xfffffe0010f03000 during recovery ioc 8048 scsi 0 state c xfer 0 (noperiph:mps0:0:14:0): SMID 9 abort TaskMID 439 status 0x0 code 0x0 count 1 (noperiph:mps0:0:14:0): SMID 9 finished recovery after aborting TaskMID 439 mps0: mpssas_free_tm releasing simq (probe14:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 (probe14:mps0:0:14:0): CAM status: Command timeout (probe14:mps0:0:14:0): Retrying command A side note is that the drive was removed from its hot-swap bay, then reinserted, and since then it is working fine. That is a seagate ST32000645SS. And no errors reported. 
Regards, On 2014-02-08 at 21:02, Brian Gardner wrote: > Last year I upgraded my production servers from ufs on adaptec raid > controllers to zfs raidz on lsi controllers. Last week I had an > outage, the culprit being a failed hard drive in my raidz. My log was > littered with kernel messages such as the ones below. About 20 > minutes after the first message some aspects of my host were hung, > and I was notified that my site was down. I noticed these messages in > my log but running a zfs status however showed only a few checksum > errors and the failed drive didn't get degraded. Manually degrading > the drive solved my problems. This seems very odd to me. It's almost > as if zfs wasn't getting messages from the lsi driver regarding these > read/write failures. Do I need to tune something in the mps/lsi or > zfs drivers to help it deal with failures? I'm running FreeBSD > 8.3-Release-p3 with the generic kernel, nothing unusual about my setup > other than I use jails extensively. There are four drives in the > raidz configuration in question: > > zpool status > pool: storage > state: ONLINE > scan: resilvered 67.6G in 2h21m with 0 errors on Tue Aug 6 00:44:14 > 2013 > config: > > NAME STATE READ WRITE CKSUM > storage ONLINE 0 0 0 > raidz1-0 ONLINE 0 0 0 > da0 ONLINE 0 0 0 > da1 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > > Jan 29 19:03:18 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). CDB: 2a > 0 9 72 d8 13 0 0 1d 0 > Jan 29 19:03:18 host2 kernel: (da0:mpslsi0:0:0:0): CAM status: SCSI > Status Error > Jan 29 19:03:18 host2 kernel: (da0:mpslsi0:0:0:0): SCSI status: Check > Condition > Jan 29 19:03:18 host2 kernel: (da0:mpslsi0:0:0:0): SCSI sense: > Deferred error: MEDIUM ERROR info:97ef13b asc:15,1 (Mechanical > positioning error) actual retry count: 15 > Jan 29 19:04:00 host2 kernel: (da0:mpslsi0:0:0:0): READ(10).
CDB: 28 0 > 9 75 b9 96 0 0 b 0 > Jan 29 19:04:00 host2 kernel: (da0:mpslsi0:0:0:0): CAM status: SCSI > Status Error > Jan 29 19:04:00 host2 kernel: (da0:mpslsi0:0:0:0): SCSI status: Check > Condition > Jan 29 19:04:00 host2 kernel: (da0:mpslsi0:0:0:0): SCSI sense: MEDIUM > ERROR info:975b99b asc:15,1 (Mechanical positioning error) actual > retry count: 15 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_scsiio_timeout checking > sc 0xffffff80005ee000 cm 0xffffff800060d408 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): READ(10). CDB: 28 0 > 2 e3 6a 1a 0 0 1 0 length 512 SMID 153 command timeout cm > 0xffffff800060d408 ccb 0xffffff000a1ce800 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_alloc_tm freezing simq > Jan 29 19:04:54 host2 kernel: mpslsi0: timedout cm 0xffffff800060d408 > allocated tm 0xffffff8000604718 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): READ(10). CDB: 28 0 > 2 e3 6a 1a 0 0 1 0 length 512 SMID 153 completed timedout cm > 0xffffff800060d408 ccb 0xffffff000a1ce800 during recovery ioc 8048 > scsi 0 state c xfer 0 > Jan 29 19:04:54 host2 kernel: (noperiph:mpslsi0:0:0:0): SMID 43 abort > TaskMID 153 status 0x0 code 0x0 count 1 > Jan 29 19:04:54 host2 kernel: (noperiph:mpslsi0:0:0:0): SMID 43 > finished recovery after aborting TaskMID 153 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_free_tm releasing simq > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_scsiio_timeout checking > sc 0xffffff80005ee000 cm 0xffffff800060d928 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): READ(10). CDB: 28 0 > 9 41 c7 53 0 0 b 0 length 5632 SMID 157 command timeout cm > 0xffffff800060d928 ccb 0xffffff0176ccb000 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_alloc_tm freezing simq > Jan 29 19:04:54 host2 kernel: mpslsi0: timedout cm 0xffffff800060d928 > allocated tm 0xffffff8000604860 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): READ(10). 
CDB: 28 0 > 9 41 c7 53 0 0 b 0 length 5632 SMID 157 completed timedout cm > 0xffffff800060d928 ccb 0xffffff0176ccb000 during recovery ioc 8048 > scsi 0 state c xfer 0(noperiph:mpslsi0:0:0:0): SMID 44 abort TaskMID > 157 status 0x0 code 0x0 count 1 > Jan 29 19:04:54 host2 kernel: (noperiph:mpslsi0:0:0:0): SMID 44 > finished recovery after aborting TaskMID 157 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_free_tm releasing simq > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_scsiio_timeout checking > sc 0xffffff80005ee000 cm 0xffffff80006235a8 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). CDB: 2a > 0 9 72 e0 d3 0 1 0 0 length 131072 SMID 429 command timeout cm > 0xffffff80006235a8 ccb 0xffffff000c600000 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_alloc_tm freezing simq > Jan 29 19:04:54 host2 kernel: mpslsi0: timedout cm 0xffffff80006235a8 > allocated tm 0xffffff80006049a8 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_scsiio_timeout checking > sc 0xffffff80005ee000 cm 0xffffff800063e2e0 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). CDB: 2a > 0 9 72 e1 d3 0 0 3 0 length 1536 SMID 764 command timeout cm > 0xffffff800063e2e0 ccb 0xffffff01dfa19800 > Jan 29 19:04:54 host2 kernel: mpslsi0: queued timedout cm > 0xffffff800063e2e0 for processing by tm 0xffffff80006049a8 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). CDB: 2a > 0 9 72 e0 d3 0 1 0 0 length 131072 SMID 429 completed timedout cm > 0xffffff80006235a8 ccb 0xffffff000c600000 during recovery ioc 8048 > scsi 0 state c xfe(noperiph:mpslsi0:0:0:0): SMID 45 abort TaskMID 429 > status 0x0 code 0x0 count 1 > Jan 29 19:04:54 host2 kernel: (noperiph:mpslsi0:0:0:0): SMID 45 > continuing recovery after aborting TaskMID 429 > Jan 29 19:04:54 host2 kernel: mpslsi0: mpssas_scsiio_timeout checking > sc 0xffffff80005ee000 cm 0xffffff8000610890 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). 
CDB: 2a > 0 9 72 e3 31 0 0 be 0 length 97280 SMID 194 command timeout cm > 0xffffff8000610890 ccb 0xffffff013d0f3800 > Jan 29 19:04:54 host2 kernel: mpslsi0: queued timedout cm > 0xffffff8000610890 for processing by tm 0xffffff80006049a8 > Jan 29 19:04:54 host2 kernel: (da0:mpslsi0:0:0:0): WRITE(10). CDB: 2a > 0 9 72 e1 d3 0 0 3 0 length 1536 SMID 764 completed timedout cm > 0xffffff800063e2e0 ccb 0xffffff01dfa19800 during recovery ioc 8048 > scsi 0 state c xfer (noperiph:mpslsi0:0:0:0): SMID 45 abort TaskMID > 764 status 0x0 code 0x0 count 1 > _______________________________________________ > zfs-devel@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/zfs-devel > To unsubscribe, send any mail to "zfs-devel-unsubscribe@freebsd.org"
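When reading the READ(10) and WRITE(10) lines in the quoted log, it can help to decode the CDB bytes by hand. The following is a minimal sketch, assuming the standard 10-byte CDB layout (opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in blocks in bytes 7-8). The struct rw10 and decode_rw10() names are illustrative and do not come from CAM or the mps(4) driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a 10-byte READ/WRITE CDB (illustrative type). */
struct rw10 {
	uint8_t  opcode;  /* 0x28 = READ(10), 0x2a = WRITE(10) */
	uint32_t lba;     /* starting logical block address */
	uint16_t blocks;  /* transfer length in blocks */
};

/* Extract the opcode, LBA and block count from a 10-byte CDB. */
static struct rw10
decode_rw10(const uint8_t cdb[10])
{
	struct rw10 r;

	assert(cdb != NULL);
	r.opcode = cdb[0];
	r.lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
	    ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	r.blocks = (uint16_t)(((uint16_t)cdb[7] << 8) | cdb[8]);
	return r;
}
```

Applied to "CDB: 28 0 9 41 c7 53 0 0 b 0" from the log, this gives LBA 0x0941c753 and 11 blocks; with 512-byte sectors that is the 5632 bytes the driver reports as "length 5632". Likewise "2a 0 9 72 e0 d3 0 1 0 0" is a 256-block write, matching "length 131072".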