Date: Mon, 14 Nov 2016 13:33:27 -0800 (PST) From: Don Lewis <truckman@FreeBSD.org> To: mjguzik@gmail.com Cc: freebsd-current@FreeBSD.org, mjg@FreeBSD.org Subject: Re: panic: mutex ncnegl not owned at /usr/src/sys/kern/vfs_cache.c:743 in 12.0-CURRENT r308447 Message-ID: <201611142133.uAELXRpb087611@gw.catspoiler.org> In-Reply-To: <20161113084054.GB14804@dft-labs.eu>
next in thread | previous in thread | raw e-mail | index | archive | help
On 13 Nov, Mateusz Guzik wrote: > On Sat, Nov 12, 2016 at 05:11:57PM -0800, Don Lewis wrote: >> What protects nc_flag, the lock for the list that it resides on? >> > > It is supposed to be the hot list lock and I think this uncovers a bug > here. > > Consider a NCF_HOTNEGATIVE entry which is being evicted. It sets the > NCV_DVDROP flag without the lock held, but the entry is still not > removed from negative lists. So in principle we can either lose the > newly set flag or the information that hotnegative is unset. > > That said, I think the fix would be to remove from negative entries > prior to setting the NCV_DVDROP flag. > > Normally the flag is protected by the hotlist lock. > > Untested, but should do the trick: > > --- vfs_cache.c.old 2016-11-13 09:37:50.096000000 +0100 > +++ vfs_cache.c.new 2016-11-13 09:39:45.004000000 +0100 > @@ -868,6 +868,13 @@ > nc_get_name(ncp), ncp->nc_neghits); > } > LIST_REMOVE(ncp, nc_hash); > + if (!(ncp->nc_flag & NCF_NEGATIVE)) { > + TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst); > + if (ncp == ncp->nc_vp->v_cache_dd) > + ncp->nc_vp->v_cache_dd = NULL; > + } else { > + cache_negative_remove(ncp, neg_locked); > + } > if (ncp->nc_flag & NCF_ISDOTDOT) { > if (ncp == ncp->nc_dvp->v_cache_dd) > ncp->nc_dvp->v_cache_dd = NULL; > @@ -878,13 +885,6 @@ > atomic_subtract_rel_long(&numcachehv, 1); > } > } > - if (!(ncp->nc_flag & NCF_NEGATIVE)) { > - TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst); > - if (ncp == ncp->nc_vp->v_cache_dd) > - ncp->nc_vp->v_cache_dd = NULL; > - } else { > - cache_negative_remove(ncp, neg_locked); > - } > atomic_subtract_rel_long(&numcache, 1); > } No obvious regressions with this patch along with a couple of the assertions that I had previously added. It survived a 14 hour poudriere run building my default package set for FreeBSD 10 i386, which paniced several times earlier.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201611142133.uAELXRpb087611>