To: freebsd-current@freebsd.org
Subject: Re: Problem with firewire disks with recent -CURRENT.
From: Richard Todd
Date: Tue, 07 May 2013 19:21:02 -0500
In-Reply-To: (rmtodd@servalan.servalan.com's message of "Sun, 05 May 2013 13:31:05 -0500")

rmtodd@servalan.servalan.com writes:

> Tried upgrading one of my machines to -CURRENT yesterday and got the
> following panic when the sbp code did its probing of all the firewire
> devices:
> panic: mutex sbp not owned at /usr/src/sys/cam/cam_xpt.c:4549
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff81fe6837f0
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff81fe6838a0
> vpanic() at vpanic+0x126/frame 0xffffff81fe6838e0
> panic() at panic+0x43/frame 0xffffff81fe683940
> __mtx_assert() at __mtx_assert+0xc2/frame 0xffffff81fe683950
> xpt_compile_path() at xpt_compile_path+0xa1/frame 0xffffff81fe6839a0
> xpt_create_path() at xpt_create_path+0x5b/frame 0xffffff81fe6839f0
> sbp_do_attach() at sbp_do_attach+0xe8/frame 0xffffff81fe683a30

I did some further poking around in the source code trying to figure
out what went on here. Looks to me like in the current version of
xpt_find_target() (called by xpt_compile_path() and hence, indirectly,
by xpt_create_path()) the code expects the SIM's mutex to be owned, but
apparently the call from sbp_do_attach() happens without the SIM mutex
being locked. I tried hacking together the following patch and the
resulting kernel comes up and lets the system properly detect the
drives and do I/O to them.
I don't know enough about the CAM system and its locking to know if
this patch is the Right Thing to do here, though.

diff -r 96ce948dd944 sys/dev/firewire/sbp.c
--- a/sys/dev/firewire/sbp.c	Sat May 04 17:23:33 2013 -0500
+++ b/sys/dev/firewire/sbp.c	Tue May 07 19:17:28 2013 -0500
@@ -1085,10 +1085,13 @@
 END_DEBUG
 	sbp_xfer_free(xfer);
 
-	if (sdev->path == NULL)
+	if (sdev->path == NULL) {
+		CAM_SIM_LOCK(target->sbp->sim);
 		xpt_create_path(&sdev->path, NULL,
 			cam_sim_path(target->sbp->sim),
 			target->target_id, sdev->lun_id);
+		CAM_SIM_UNLOCK(target->sbp->sim);
+	}
 
 	/*
 	 * Let CAM scan the bus if we are in the boot process.
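
For anyone following along who wants to see the shape of the problem
outside the kernel, here is a minimal userland sketch in C of the
"callee expects the lock to be held" pattern that the patch works
around. This is not FreeBSD code: toy_sim, toy_sim_lock(),
toy_sim_unlock(), toy_create_path() and the two toy_attach_*() helpers
are hypothetical names invented for illustration, the "lock" is just a
flag, and the callee returns -1 where the real kernel would panic
through its mutex assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SIM with its lock; a flag, not a real mutex. */
struct toy_sim {
	bool locked;
};

void toy_sim_lock(struct toy_sim *sim)
{
	assert(sim != NULL);
	sim->locked = true;
}

void toy_sim_unlock(struct toy_sim *sim)
{
	assert(sim != NULL);
	sim->locked = false;
}

/*
 * Models a callee that requires the caller to hold the SIM lock.
 * Returns -1 if the lock is not held (the kernel would panic instead),
 * otherwise stores a made-up path value and returns 0.
 */
int toy_create_path(struct toy_sim *sim, int *path, int target, int lun)
{
	assert(sim != NULL && path != NULL);
	if (!sim->locked)
		return -1;
	*path = target * 256 + lun;	/* arbitrary encoding for the sketch */
	return 0;
}

/* Shape of the call before the patch: no lock taken around the callee. */
int toy_attach_unlocked(struct toy_sim *sim, int *path, int target, int lun)
{
	return toy_create_path(sim, path, target, lun);
}

/* Shape of the call after the patch: lock, call, unlock. */
int toy_attach_locked(struct toy_sim *sim, int *path, int target, int lun)
{
	int error;

	toy_sim_lock(sim);
	error = toy_create_path(sim, path, target, lun);
	toy_sim_unlock(sim);
	return error;
}
```

In this model the unlocked variant fails and leaves the path untouched,
while the locked variant succeeds and releases the lock afterwards.
Whether taking the SIM lock at this point is correct for the real CAM
locking rules is exactly the open question above; the sketch only
illustrates the calling convention.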