Date: Tue, 16 Apr 2019 17:04:46 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 226510] panic: Re-refing for reason 5, cnt = 1
Message-ID: <bug-226510-227-DlZeA5h0cF@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-226510-227@https.bugs.freebsd.org/bugzilla/>
References: <bug-226510-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226510

Eric van Gyzen <vangyzen@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vangyzen@FreeBSD.org

--- Comment #21 from Eric van Gyzen <vangyzen@FreeBSD.org> ---
For the record, here are the commits that seem to be related to this PR,
including followup commits to fix regressions.  I'm recording them here just
because I had to find them for myself.

commit 02d268dd2672cab6c99d55edc230623ab60acf3f
Author: imp <imp@FreeBSD.org>
Date:   Mon Mar 12 15:17:16 2018 +0000

    Tighten up periph lock to avoid some races

    Make sure the periph lock is held around rmw access to softc data,
    especially flags, including work flags in iosched.  Add asserts for the
    periph lock where it should be held.

    PR: 226510
    Sponsored by: Netflix
    Differential Review: https://reviews.freebsd.org/D14456

Notes (freebsd):
    svn path=/head/; revision=330796

commit bf523f13ef5a3a6d06e76be0df100ac13b0d1d11
Author: imp <imp@FreeBSD.org>
Date:   Sat Mar 17 16:04:06 2018 +0000

    Only take out the periph lock when we're modifying the flags of the softc
    for an async unit attention.

    CAM locks, sometimes, the periph lock and other times does not.  We were
    taking the lock always and running into lock recursion issues on a
    non-recursive lock.  Now we take it selectively.  It's not clear why xpt
    takes the lock selectively before calling us, though, and that's still
    under investigation.

    Reported by: avg
    PR: 226510 (same panic, different circumstances)
    Sponsored by: Netflix

Notes (freebsd):
    svn path=/head/; revision=331097

commit c2ed5522d0e7837332d4dfcad73179f6f0df45c2
Author: imp <imp@FreeBSD.org>
Date:   Tue Mar 20 22:07:45 2018 +0000

    Release the "TUR" reference when clearing the TUR work flag.

    We mostly do this right, except when there's no BP and we do a TUR by
    request.  In that case, we clear the flag, but don't release the
    reference, leaking the reference on rare occasion.

    PR: 226510
    Sponsored by: Netflix

Notes (freebsd):
    svn path=/head/; revision=331273

commit 20eb8298f5923ea3ab2734cd24f8ee0f12cf8b98
Author: imp <imp@FreeBSD.org>
Date:   Wed Mar 21 12:55:59 2018 +0000

    Revert r331273: "Release the "TUR" reference when clearing the TUR work
    flag. We mostly"

    It exposes other issues, so revert to the previous state of known issues.

Notes (freebsd):
    svn path=/head/; revision=331291

commit 685a9276f2ecb16a977f044eda1490e1f243a043
Author: imp <imp@FreeBSD.org>
Date:   Fri Mar 23 16:23:15 2018 +0000

    Flag when we have a pending TUR.  Don't schedule another one when we have
    one pending.  Otherwise, we can race and send two, which is wasteful in
    close proximity.  It can also cause the acquire/release count for TUR to
    be > 1, which is unexpected.

    PR: 226510
    Differential Review: https://reviews.freebsd.org/D14792

Notes (freebsd):
    svn path=/head/; revision=331435

commit 896df23a52b2a955b338a931ea514c44aec48cba
Author: ken <ken@FreeBSD.org>
Date:   Thu Jun 14 17:08:44 2018 +0000

    Fix da(4) locking when probing SMR drives.

    Probing host aware and host managed SMR drives got broken in revision
    330796.  The added cam_periph_lock() calls were in areas in dadone()
    where the peripheral lock was already held.

    Since then, dadone() has been split into separate functions that are
    dedicated to each probe state.

    The result is that when probing a host aware drive, I ran into a
    recursive lock acquisition in dadone_probeatalogdir().  I would have run
    into the same problem in dadone_probeataiddir(), and in
    dadone_probeatasup() and dadone_probeatazone() in the error paths had the
    probe continued.

    The solution is to take out all of the extra cam_periph_lock() calls.  I
    also added cam_periph_assert(periph, MA_OWNED) near the top of each of
    the dadone_* calls.  These make it clear to anyone coming along in the
    future that the lock is held in the probe done functions.

    Also add a locking assert in daprobedone(), to make it clear that it must
    be called with the periph lock held.

    Sponsored by: Spectra Logic
    Differential Revision: https://reviews.freebsd.org/D15764

Notes (freebsd):
    svn path=/head/; revision=335154

commit 92253110610c28fc34b45c0c6894294395f480bd
Author: imp <imp@FreeBSD.org>
Date:   Mon Nov 5 18:47:29 2018 +0000

    Only assert locked for many async events.

    Many async events that we see are called for this specific path.  When
    calling an async callback for a targeted device, XPT will lock that
    specific device's path lock (same as what cam_periph_lock does).  For
    those AC_ events, assert we have the lock rather than trying to
    recursively take it (which causes panics since it's not recursive).  Add
    annotations about this and about the fact that AC_SCSI_AEN events are
    generated now only in the ata stack (which cannot have a scsi_da
    attachment).  Leave it in place in case I've overlooked something as the
    code is harmless.

    This is fallout from my attempts to "fix" locking for softc->flags in
    r330796 that's not been triggered often enough to get my attention until
    now.

    Sponsored by: Netflix
    MFC After: 3 days
    Differential Revision: https://reviews.freebsd.org/D17837

Notes (freebsd):
    svn path=/head/; revision=340155

-- 
You are receiving this mail because:
You are the assignee for the bug.