From: Andriy Gapon
Date: Thu, 19 Dec 2013 09:56:28 +0200
To: Konstantin Belousov, peter@holm.cc
Cc: freebsd-fs
Subject: Re: namecache: numneg > 0 but ncneg is empty
Message-ID: <52B2A6AC.3070902@FreeBSD.org>
In-Reply-To: <20131219070350.GM59496@kib.kiev.ua>
References: <52B16847.8090905@FreeBSD.org> <20131219070350.GM59496@kib.kiev.ua>

on 19/12/2013 09:03 Konstantin Belousov said the following:
> On Wed, Dec 18, 2013 at 11:17:59AM +0200, Andriy Gapon wrote:
>>
>> I've been running a test that exercises vfs, fs and namecache code quite a lot
>> and I have run into the following panic:
[snip]
>> (kgdb) fr 8
>> #8  0xffffffff8097c22f in cache_enter_time (dvp=0xfffffe031c7215f8,
>> vp=0xfffffe0a684f05f8, cnp=0xffffff9de1875858, tsp=0x0, dtsp=0x0) at
>> /usr/src/sys/kern/vfs_cache.c:902
>> 902             cache_zap(ncp);
>> (kgdb) list
>> 897                     zap = 1;
>> 898             }
>> 899     if (hold)
>> 900             vhold(dvp);
>> 901     if (zap)
>> 902             cache_zap(ncp);
>> 903     CACHE_WUNLOCK();
>> 904 }
>> 905
>> 906 /*
>> (kgdb) i loc
>> ncp = (struct namecache *) 0x0
>> n2 = (struct namecache *) 0xffffffff8178a740
>> ncpp = (struct nchashhead *) 0xffffff8ccde4e9b0
>> hash =
>> flag = 0
>> hold = 1
>> zap = 1
>> len =
>>
>> (kgdb) p numneg
>> $4 = 437
>> (kgdb) p ncp
>> $7 = (struct namecache *) 0x0
>> (kgdb) p ncneg
>> $8 = {tqh_first = 0x0, tqh_last = 0xffffffff8178a710}
>>
>>
>> I am not sure that there is a bug in namecache, but if there is one, then the
>> only suspicious place I could find is ".." handling in cache_enter_time().
>>
>
> Do you mean that numneg accounting is wrong for the case when the
> existing ncp retargeted for dd ? This is the only issue I see there, but
> it looks as the real case for the failure.

Yes, this was the case that I suspected.

> Testcase would be lot of lookups down the long directory hierarchy, and
> than walking back through the ".." entries. Even if the thing does not
> panic, the resulting length of the ncneg tailq should be strictly less
> than the numneg.

Kostik,

thank you for the patch! I will test it in my environment.

Peter,

I am curious about the ideology behind vfs testing in stress2. I know that I
can just look at the code myself, but I hope that asking you will be faster.
Does stress2 exercise a fixed set of scenarios? Or does it have an element of
randomness?

The reason I am asking is that I have found fsstress (xfsstress) insufficient
for finding all the corner cases. I wrote a really simple script that just
performs random operations (creating, unlinking, renaming, etc.) on a file or
directory using randomly generated paths (with certain constraints). Running a
hundred instances of that script on the same hierarchy is surprisingly
effective at uncovering bugs that are very hard to reproduce otherwise.
So, I am wondering if I've just duplicated what you already had.

-- 
Andriy Gapon