Date:      Thu, 19 Dec 2013 09:56:28 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>, peter@holm.cc
Cc:        freebsd-fs <freebsd-fs@FreeBSD.org>
Subject:   Re: namecache: numneg > 0 but ncneg is empty
Message-ID:  <52B2A6AC.3070902@FreeBSD.org>
In-Reply-To: <20131219070350.GM59496@kib.kiev.ua>
References:  <52B16847.8090905@FreeBSD.org> <20131219070350.GM59496@kib.kiev.ua>

on 19/12/2013 09:03 Konstantin Belousov said the following:
> On Wed, Dec 18, 2013 at 11:17:59AM +0200, Andriy Gapon wrote:
>>
>> I've been running a test that exercises vfs, fs and namecache code quite a lot
>> and I have run into the following panic:
[snip]
>> (kgdb) fr 8
>> #8  0xffffffff8097c22f in cache_enter_time (dvp=0xfffffe031c7215f8,
>> vp=0xfffffe0a684f05f8, cnp=0xffffff9de1875858, tsp=0x0, dtsp=0x0) at
>> /usr/src/sys/kern/vfs_cache.c:902
>> 902                     cache_zap(ncp);
>> (kgdb) list
>> 897                     zap = 1;
>> 898             }
>> 899             if (hold)
>> 900                     vhold(dvp);
>> 901             if (zap)
>> 902                     cache_zap(ncp);
>> 903             CACHE_WUNLOCK();
>> 904     }
>> 905
>> 906     /*
>> (kgdb) i loc
>> ncp = (struct namecache *) 0x0
>> n2 = (struct namecache *) 0xffffffff8178a740
>> ncpp = (struct nchashhead *) 0xffffff8ccde4e9b0
>> hash = <value optimized out>
>> flag = 0
>> hold = 1
>> zap = 1
>> len = <value optimized out>
>>
>> (kgdb) p numneg
>> $4 = 437
>> (kgdb) p ncp
>> $7 = (struct namecache *) 0x0
>> (kgdb) p ncneg
>> $8 = {tqh_first = 0x0, tqh_last = 0xffffffff8178a710}
>>
>>
>> I am not sure that there is a bug in namecache, but if there is one, then the
>> only suspicious place I could find is ".." handling in cache_enter_time().
>>
> 
> Do you mean that the numneg accounting is wrong in the case when the
> existing ncp is retargeted for dd?  This is the only issue I see there, but
> it does look like a real cause of the failure.

Yes, this was the case that I suspected.

> A test case would be a lot of lookups down a long directory hierarchy, and
> then walking back up through the ".." entries.  Even if the thing does not
> panic, the resulting length of the ncneg tailq would be strictly less
> than numneg.
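
[Editor's sketch, not part of the original message.] The test case described above could be approximated from user space roughly as follows. This is only a minimal sketch under stated assumptions: the function name walk_down_and_back, the directory name "d", the depth, and the probe of a nonexistent name "missing" (added here to generate negative lookups) are all hypothetical choices, and nothing in the thread confirms that this exact sequence triggers the panic.

```python
import os
import tempfile


def walk_down_and_back(root, depth=50, name="d"):
    """Descend a chain of nested directories, then climb back via "..".

    Returns (back_at_root, steps_up). All names and the depth are
    illustrative assumptions, not taken from the original thread.
    """
    start = os.getcwd()
    # Build root/d/d/d/... one level at a time.
    path = root
    for _ in range(depth):
        path = os.path.join(path, name)
    os.makedirs(path, exist_ok=True)
    try:
        os.chdir(root)
        for _ in range(depth):
            # Assumption: looking up a name that does not exist is meant
            # to create a negative name cache entry at each level.
            try:
                os.stat("missing")
            except FileNotFoundError:
                pass
            os.chdir(name)  # relative lookup, one level down
        steps_up = 0
        for _ in range(depth):
            os.stat("..")   # explicit ".." lookup
            os.chdir("..")  # walk back up one level
            steps_up += 1
        return os.path.samefile(os.getcwd(), root), steps_up
    finally:
        os.chdir(start)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(walk_down_and_back(tmp))
```

Using relative lookups with chdir keeps every path short, so the depth is not limited by the maximum path length. Comparing numneg with the length of the ncneg tailq afterwards still has to be done in the kernel debugger, as in the kgdb session quoted earlier.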

Kostik,

thank you for the patch!  I will test it in my environment.

Peter,

I am curious about the philosophy behind VFS testing in stress2.  I know that
I could just read the code myself, but I hope that asking you will be faster.
Does stress2 exercise a fixed set of scenarios?  Or does it have an element of
randomness?

The reason I am asking is that I have found fsstress (xfsstress) insufficient
for finding all the corner cases.  I wrote a really simple script that just
performs random operations (create, unlink, rename, etc.) on a file or
directory, using randomly generated paths (with certain constraints).  Running a
hundred instances of that script on the same hierarchy is surprisingly effective
at uncovering bugs that are very hard to reproduce otherwise.
So, I am wondering if I've just duplicated what you already have.
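
[Editor's sketch, not part of the original message.] A random-operation script of the kind described above might look roughly like the following. It is not the author's script: the name pool, the maximum depth, the set of operations, and the function names random_path, random_op and run are all assumptions made for illustration. The only "constraint" modeled here is that paths are built from a very small set of names, so that independent instances collide on the same files and directories.

```python
import os
import random
import tempfile

NAMES = ["a", "b", "c"]  # assumed: tiny name pool so that paths collide often
MAX_DEPTH = 3            # assumed: keep the hierarchy shallow
OPS = ["mkdir", "create", "unlink", "rmdir", "rename", "stat"]


def random_path(root, rng):
    """Return a random path of 1..MAX_DEPTH components under root."""
    depth = rng.randint(1, MAX_DEPTH)
    return os.path.join(root, *(rng.choice(NAMES) for _ in range(depth)))


def random_op(root, rng):
    """Attempt one random operation; failures are expected and ignored."""
    op = rng.choice(OPS)
    path = random_path(root, rng)
    try:
        if op == "mkdir":
            os.mkdir(path)
        elif op == "create":
            with open(path, "w"):
                pass
        elif op == "unlink":
            os.unlink(path)
        elif op == "rmdir":
            os.rmdir(path)
        elif op == "rename":
            os.rename(path, random_path(root, rng))
        else:
            os.stat(path)
    except OSError:
        # ENOENT, EEXIST, ENOTDIR, ENOTEMPTY and so on are all normal here.
        return op, False
    return op, True


def run(root, iterations, seed=None):
    """Perform `iterations` random operations and count the outcomes."""
    rng = random.Random(seed)
    counts = {"ok": 0, "failed": 0}
    for _ in range(iterations):
        _, ok = random_op(root, rng)
        counts["ok" if ok else "failed"] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run(tmp, 10000))
```

To approximate the setup described above, many copies would be started at the same time against one shared root directory rather than a private temporary one, each with its own seed, so that the instances race on the same names.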

-- 
Andriy Gapon


