From: Steve Kargl
To: John Baldwin
Cc: Alexander Motin, freebsd-current@freebsd.org, scsi@freebsd.org
Date: Mon, 3 Feb 2014 08:32:19 -0800
Subject: Re: Instant panic CAM or USB subsystem
Message-ID: <20140203163219.GA13386@troutmask.apl.washington.edu>
In-Reply-To: <201401281232.21958.jhb@freebsd.org>
References: <20140125172106.GA67590@troutmask.apl.washington.edu> <201401281232.21958.jhb@freebsd.org>

On Tue, Jan 28, 2014 at 12:32:21PM -0500,
John Baldwin wrote:
> On Saturday, January 25, 2014 12:21:06 pm Steve Kargl wrote:
> > If I plug my Samsung Intensity II cellphone into a usb port,
> > I get an instant panic.  This is 100% reproducible.  I have
> > the core and kernel for further debugging.  Dmesg.boot follows
> > my sig.
> >
> > % kgdb /boot/kernel/kernel /vmcore.0
> >
> > Unread portion of the kernel message buffer:
> > cd1 at umass-sim1 bus 1 scbus4 target 0 lun 0
> > cd1: Removable CD-ROM SCSI-2 device
> > cd1: Serial Number 000000000002
> > cd1: 1.000MB/s transfers
> > cd1: cd present [3840000 x 512 byte records]
> > cd1: quirks=0x10<10_BYTE_ONLY>
> > panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
> > cpuid = 0
> > KDB: enter: panic
>
> scsi@ might work better for this.  It looks like when cdasync() calls
> cam_periph_alloc() it doesn't have its associated xpt_path locked.  All the
> other async xpt callbacks I looked at don't lock the xpt path either.  It
> seems they expect it to be locked by the caller when they are invoked.  It
> seems xpt_async_process_dev() doesn't always lock xpt_lock, but sometimes
> locks the device instead:
>
>         /*
>          * If async for specific device is to be delivered to
>          * the wildcard client, take the specific device lock.
>          * XXX: We may need a way for client to specify it.
>          */
>         if ((device->lun_id == CAM_LUN_WILDCARD &&
>              path->device->lun_id != CAM_LUN_WILDCARD) ||
>             (device->target->target_id == CAM_TARGET_WILDCARD &&
>              path->target->target_id != CAM_TARGET_WILDCARD) ||
>             (device->target->bus->path_id == CAM_BUS_WILDCARD &&
>              path->target->bus->path_id != CAM_BUS_WILDCARD)) {
>                 mtx_unlock(&device->device_mtx);
>                 xpt_path_lock(path);
>                 relock = 1;
>         } else
>                 relock = 0;
>
>         (*(device->target->bus->xport->async))(async_code,
>             device->target->bus, device->target, device, async_arg);
>         xpt_async_bcast(&device->asyncs, async_code, path, async_arg);
>
>         if (relock) {
>                 xpt_path_unlock(path);
>                 mtx_lock(&device->device_mtx);
>         }
>
> Maybe try going up to this frame (16) in your dump and do
> 'p *device->target'?  However, someone with more CAM knowledge needs to look
> at this to see what is actually broken.
>

I finally have time to look at this again.  Here's kgdb for frame 16
as you suggested and then frame 17.

Script started on Mon Feb  3 08:16:32 2014
% kgdb /dsk1/obj/usr/src/sys/MOBILE/kernel.debug vmcore.0

Unread portion of the kernel message buffer:
panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
cpuid = 1
KDB: enter: panic

#16 0xc047d6a5 in xpt_async_process_dev (device=, arg=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4208
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
#18 0xc047bd15 in xpt_done_process (ccb_h=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:5249
#19 0xc047ef14 in xpt_done_td (arg=) at /usr/src/sys/cam/cam_xpt.c:5276
#20 0xc0723daf in fork_exit (callout=0xc047edb0 )
    at /usr/src/sys/kern/kern_fork.c:977
#21 0xc09fb3e4 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:278
Current language:  auto; currently minimal
(kgdb) frame 16
#16 0xc047d6a5 in xpt_async_process_dev (device=, arg=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4208
4208            cur_entry->callback(cur_entry->callback_arg,
(kgdb) p *device
Cannot access memory at address 0x0
(kgdb) up 1
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
4173            xpt_async_process_dev(xpt_periph->path->device, ccb);
(kgdb) p *xpt_periph->path->device->target
$2 = {ed_entries = {tqh_first = 0xc6f4b800, tqh_last = 0xc6f4b80c}, links = {
    tqe_next = 0x0, tqe_prev = 0xc6eaaa00}, bus = 0xc6eaaa00,
  target_id = 4294967295, refcount = 2, generation = 1, last_reset = {
    tv_sec = 0, tv_usec = 0}, rpl_size = 0, luns = 0x0, luns_mtx = {
    lock_object = {lo_name = 0xc0a3f9bc "CAM LUNs lock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) p *xpt_periph->path->device->target->bus
$3 = {et_entries = {tqh_first = 0xc6eaa980, tqh_last = 0xc6eaa988}, links = {
    tqe_next = 0x0, tqe_prev = 0xc7690008}, path_id = 4294967295,
  sim = 0xc6eaaa80, last_reset = {tv_sec = 0, tv_usec = 0}, flags = 0,
  refcount = 3, generation = 3, parent_dev = 0x0, xport = 0xc0b2f568,
  eb_mtx = {lock_object = {lo_name = 0xc0a3f85a "CAM bus lock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) quit
% exit
exit

Script done on Mon Feb  3 08:20:44 2014

-- 
Steve