Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 14 Nov 2016 13:33:27 -0800 (PST)
From:      Don Lewis <truckman@FreeBSD.org>
To:        mjguzik@gmail.com
Cc:        freebsd-current@FreeBSD.org, mjg@FreeBSD.org
Subject:   Re: panic: mutex ncnegl not owned at /usr/src/sys/kern/vfs_cache.c:743 in 12.0-CURRENT r308447
Message-ID:  <201611142133.uAELXRpb087611@gw.catspoiler.org>
In-Reply-To: <20161113084054.GB14804@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help
On 13 Nov, Mateusz Guzik wrote:
> On Sat, Nov 12, 2016 at 05:11:57PM -0800, Don Lewis wrote:

>> What protects nc_flag, the lock for the list that it resides on?
>> 
> 
> It is supposed to be the hot list lock and I think this uncovers a bug
> here.
> 
> Consider a NCF_HOTNEGATIVE entry which is being evicted. It sets the
> NCV_DVDROP flag without the lock held, but the entry is still not
> removed from negative lists. So in principle we can either lose the
> newly set flag or the information that hotnegative is unset.
> 
> That said, I think the fix would be to remove from negative entries
> prior to setting the NCV_DVDROP flag.
> 
> Normally the flag is protected by the hotlist lock.
> 
> Untested, but should do the trick:
> 
> --- vfs_cache.c.old	2016-11-13 09:37:50.096000000 +0100
> +++ vfs_cache.c.new	2016-11-13 09:39:45.004000000 +0100
> @@ -868,6 +868,13 @@
>  		    nc_get_name(ncp), ncp->nc_neghits);
>  	}
>  	LIST_REMOVE(ncp, nc_hash);
> +	if (!(ncp->nc_flag & NCF_NEGATIVE)) {
> +		TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst);
> +		if (ncp == ncp->nc_vp->v_cache_dd)
> +			ncp->nc_vp->v_cache_dd = NULL;
> +	} else {
> +		cache_negative_remove(ncp, neg_locked);
> +	}
>  	if (ncp->nc_flag & NCF_ISDOTDOT) {
>  		if (ncp == ncp->nc_dvp->v_cache_dd)
>  			ncp->nc_dvp->v_cache_dd = NULL;
> @@ -878,13 +885,6 @@
>  			atomic_subtract_rel_long(&numcachehv, 1);
>  		}
>  	}
> -	if (!(ncp->nc_flag & NCF_NEGATIVE)) {
> -		TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst);
> -		if (ncp == ncp->nc_vp->v_cache_dd)
> -			ncp->nc_vp->v_cache_dd = NULL;
> -	} else {
> -		cache_negative_remove(ncp, neg_locked);
> -	}
>  	atomic_subtract_rel_long(&numcache, 1);
>  }

No obvious regressions with this patch along with a couple of the
assertions that I had previously added.  It survived a 14 hour poudriere
run building my default package set for FreeBSD 10 i386, which paniced
several times earlier.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201611142133.uAELXRpb087611>