Date: Thu, 19 Dec 2013 09:12:18 +0100
From: Peter Holm
To: Andriy Gapon
Cc: freebsd-fs
Subject: Re: namecache: numneg > 0 but ncneg is empty
Message-ID: <20131219081218.GA12747@x2.osted.lan>
References: <52B16847.8090905@FreeBSD.org> <20131219070350.GM59496@kib.kiev.ua> <52B2A6AC.3070902@FreeBSD.org>
In-Reply-To: <52B2A6AC.3070902@FreeBSD.org>

On Thu, Dec 19, 2013 at 09:56:28AM +0200, Andriy Gapon wrote:
> on 19/12/2013 09:03 Konstantin Belousov said the following:
> > On Wed, Dec 18, 2013 at 11:17:59AM +0200, Andriy Gapon wrote:
> >>
> >> I've been
running a test that exercises vfs, fs and namecache code quite a lot
> >> and I have run into the following panic:
> [snip]
> >> (kgdb) fr 8
> >> #8  0xffffffff8097c22f in cache_enter_time (dvp=0xfffffe031c7215f8,
> >> vp=0xfffffe0a684f05f8, cnp=0xffffff9de1875858, tsp=0x0, dtsp=0x0) at
> >> /usr/src/sys/kern/vfs_cache.c:902
> >> 902                     cache_zap(ncp);
> >> (kgdb) list
> >> 897                     zap = 1;
> >> 898             }
> >> 899             if (hold)
> >> 900                     vhold(dvp);
> >> 901             if (zap)
> >> 902                     cache_zap(ncp);
> >> 903             CACHE_WUNLOCK();
> >> 904     }
> >> 905
> >> 906     /*
> >> (kgdb) i loc
> >> ncp = (struct namecache *) 0x0
> >> n2 = (struct namecache *) 0xffffffff8178a740
> >> ncpp = (struct nchashhead *) 0xffffff8ccde4e9b0
> >> hash =
> >> flag = 0
> >> hold = 1
> >> zap = 1
> >> len =
> >>
> >> (kgdb) p numneg
> >> $4 = 437
> >> (kgdb) p ncp
> >> $7 = (struct namecache *) 0x0
> >> (kgdb) p ncneg
> >> $8 = {tqh_first = 0x0, tqh_last = 0xffffffff8178a710}
> >>
> >>
> >> I am not sure that there is a bug in namecache, but if there is one, then the
> >> only suspicious place I could find is ".." handling in cache_enter_time().
> >>
> >
> > Do you mean that numneg accounting is wrong for the case when the
> > existing ncp retargeted for dd ?  This is the only issue I see there, but
> > it looks as the real case for the failure.
>
> Yes, this was the case that I suspected.
>
> > Testcase would be lot of lookups down the long directory hierarchy, and
> > than walking back through the ".." entries.  Even if the thing does not
> > panic, the resulting length of the ncneg tailq should be strictly less
> > than the numneg.
>
> Kostik,
>
> thank you for the patch!  I will test it in my environment.
>
> Peter,
>
> I am curious about what ideology is behind vfs testing in stress2.  I know that
> I can just look at the code myself, but hope that asking you could be faster.
> Does stress2 exercise a certain set of scenarios?  Or does it have an element of
> randomness?
>

The tests found in stress2/testcases do everything in a random fashion.
The tests found in stress2/misc are for the most part scenarios that have
been used for finding specific problems.

> The reason I am asking is that I have found fsstress (xfsstress) insufficient
> for finding all the corner cases.  I wrote a really simple script that just
> performs random operations like creating, unlinking, renaming, etc a file or
> directory using randomly generated paths (with certain constraints).  Running a
> hundred instances of that script on the same hierarchy is surprisingly effective
> at uncovering bugs that are very hard to reproduce otherwise.
> So, I am wondering if I've just duplicated what you already had.
>
> --
> Andriy Gapon

-- 
Peter
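
For readers who want to try the approach described in the quoted text (random create, unlink and rename operations on randomly generated paths), here is a minimal sketch in Python. It is not Andriy's script and not part of stress2. The name pool, the depth limit and the function names (random_path, random_op, run) are all assumptions made for illustration.

```python
import os
import random
import tempfile

# Assumed constraints, chosen only for illustration: a tiny name pool and a
# shallow depth make independent runs collide on the same paths often.
NAMES = ["a", "b", "c", "d"]
MAX_DEPTH = 3


def random_path(root, rng):
    """Build a random path below root from the small name pool."""
    depth = rng.randint(1, MAX_DEPTH)
    return os.path.join(root, *(rng.choice(NAMES) for _ in range(depth)))


def random_op(root, rng):
    """Attempt one random operation; return (operation name, succeeded)."""
    op = rng.choice(["mkdir", "create", "unlink", "rmdir", "rename", "stat"])
    path = random_path(root, rng)
    try:
        if op == "mkdir":
            os.mkdir(path)
        elif op == "create":
            with open(path, "w"):
                pass
        elif op == "unlink":
            os.unlink(path)
        elif op == "rmdir":
            os.rmdir(path)
        elif op == "rename":
            os.rename(path, random_path(root, rng))
        else:
            # A plain lookup; on a missing name this is a failed lookup,
            # which a name cache may record as a negative entry.
            os.stat(path)
        return op, True
    except OSError:
        # Most attempts fail (missing parent, name exists, wrong type, ...).
        # That is expected; the failures are part of the workload.
        return op, False


def run(root, iterations, seed):
    """Run a fixed number of random operations; return how many succeeded."""
    rng = random.Random(seed)
    succeeded = 0
    for _ in range(iterations):
        _, ok = random_op(root, rng)
        succeeded += ok
    return succeeded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(run(root, 1000, seed=1))
```

To approximate "a hundred instances on the same hierarchy", one would start many processes that each call run() on the same root with a different seed. The sketch leaves that out to stay short.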