Date: Wed, 13 Jan 2016 10:40:17 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov
Cc: FreeBSD Filesystems
Subject: Re: panic ffs_truncate3 (maybe fuse being evil)

I wrote:
> Kostik wrote:
> > On Sun, Jan 10, 2016 at 10:01:57AM -0500, Rick Macklem wrote:
> > > Hi,
> > >
> > > When fooling around with GlusterFS, I can get this panic intermittently.
> > > (I had a couple yesterday.) This happens on a Dec. 5, 2015 head kernel.
> > >
> > > panic: ffs_truncate3
> > > - backtrace without the numbers (I just scribbled it off the screen)
> > >   ffs_truncate()
> > >   ufs_inactive()
> > >   VOP_INACTIVE_APV()
> > >   vinactive()
> > >   vputx()
> > >   kern_unlinkat()
> > >
> > > So, at a glance, it seems that either
> > >   bo_dirty.bv_cnt
> > > or
> > >   bo_clean.bv_cnt
> > > is non-zero. (There is another case for the panic, but I thought it
> > > was less likely?)
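> > > For reference, here is roughly what the check that fires looks like
> > > (the INVARIANTS block at the end of ffs_truncate() in
> > > sys/ufs/ffs/ffs_inode.c; quoted from memory, so treat it as a sketch
> > > rather than the exact code):
> > >
> > >     BO_LOCK(bo);
> > >     if (length == 0 &&
> > >         (fs->fs_flags & (FS_UNCLEAN | FS_NEEDSFSCK)) == 0 &&
> > >         (bo->bo_dirty.bv_cnt > 0 || bo->bo_clean.bv_cnt > 0))
> > >             panic("ffs_truncate3");
> > >     BO_UNLOCK(bo);
> > >
> > > So any buffer, clean or dirty, still hanging off the vnode's bufobj
> > > after a truncate to 0 length trips it.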
> > >
> > > So, I'm wondering if this might be another side effect of r291460,
> > > since after that a new vnode isn't completely zero'd out?
> > >
> > > However, shouldn't bo_dirty.bv_cnt and bo_clean.bv_cnt be zero when
> > > a vnode is recycled?
> > > Does this make sense or do some fields of v_bufobj need to be zero'd
> > > out by getnewvnode()?
> > Look at the _vdrop(). When a vnode is freed to the zone, it is asserted
> > that the bufobj queues are empty. I very much doubt that it is possible
> > to leak either buffers or counters by reuse.
> >
> > > GlusterFS is using fuse and I suspect that fuse isn't cleaning out
> > > the buffers under some circumstances. (I already noticed that there
> > > isn't any code in its fuse_vnop_reclaim() and I vaguely recall that
> > > there are conditions where VOP_INACTIVE() gets skipped, so
> > > VOP_RECLAIM() has to check for anything that would have been done by
> > > VOP_INACTIVE() and do it, if it isn't already done.)
> > But even if fuse leaves the buffers around, is it UFS which panics for
> > you? I would rather worry about dangling pointers and use-after-free in
> > fuse, which is a known issue with it anyway. I.e. it could be that fuse
> > operates on a reclaimed and reused vnode as its own.
> >
> > > Anyhow, if others have thoughts on this (or other hunches w.r.t. what
> > > could cause this panic()), please let me know.
> >
> > The ffs_truncate3 panic was deterministically triggered by a bug in
> > ffs_balloc(). The routine allocated buffers for indirect blocks, but if
> > the blocks could not be allocated, the buffers were left on the queue.
> > See r174973; this was fixed a very long time ago.
> >
> Well, although I have r174973 in the kernel that crashes, it looks like
> this bug might have been around for a while.
> Here's what I've figured out so far.
> 1 - The crashes only occur if soft updates are disabled. This isn't
>     surprising if you look at ffs_truncate(), since the test for the
>     panic isn't done when soft updates are enabled.
>     Here's the snippet from ffs_truncate(), in case you are interested:
>
>     if (DOINGSOFTDEP(vp)) {
>             if (softdeptrunc == 0 && journaltrunc == 0) {
>                     /*
>                      * If a file is only partially truncated, then
>                      * we have to clean up the data structures
>                      * describing the allocation past the truncation
>                      * point. Finding and deallocating those structures
>                      * is a lot of work. Since partial truncation occurs
>                      * rarely, we solve the problem by syncing the file
>                      * so that it will have no data structures left.
>                      */
>                     if ((error = ffs_syncvnode(vp, MNT_WAIT, 0)) != 0)
>                             return (error);
>             } else {
>                     flags = IO_NORMAL | (needextclean ? IO_EXT: 0);
>                     if (journaltrunc)
>                             softdep_journal_freeblocks(ip, cred, length,
>                                 flags);
>                     else
>                             softdep_setup_freeblocks(ip, length, flags);
>                     ASSERT_VOP_LOCKED(vp, "ffs_truncate1");
>                     if (journaltrunc == 0) {
>                             ip->i_flag |= IN_CHANGE | IN_UPDATE;
>                             error = ffs_update(vp, 0);
>                     }
>                     return (error);
>             }
>     }
>
> You can see that it always returns once it is in this code block. The
> only way the code can get past this block when soft updates are enabled
> is a "goto extclean;", which takes you past the panic().
>
> By adding a few printf()s, I have determined:
> - bo_clean.bv_cnt == 1 when the panic occurs, and the b_lblkno of the
>   buffer is negative.
>
> If you look at vtruncbuf():
>
>          trunclbn = (length + blksize - 1) / blksize;
> 1726
> 1727     ASSERT_VOP_LOCKED(vp, "vtruncbuf");
> 1728 restart:
> 1729     bo = &vp->v_bufobj;
> 1730     BO_LOCK(bo);
> 1731     anyfreed = 1;
> 1732     for (;anyfreed;) {
> 1733             anyfreed = 0;
> 1734             TAILQ_FOREACH_SAFE(bp, &bo->bo_clean.bv_hd, b_bobufs, nbp) {
> 1735                     if (bp->b_lblkno < trunclbn)
> 1736                             continue;
>
> When length == 0, trunclbn is 0, but the test at line #1735 will skip
> over the buffer because its b_lblkno is negative.
>
> That is as far as I've gotten. A couple of things I need help from
> others on:
> - Is vtruncbuf() skipping over the cases where b_lblkno < 0 a feature or
>   a bug?
> - If it is a feature, then what needs to be done in the code after the
>   vtruncbuf() call in ffs_truncate() to ensure the buffer is gone by the
>   time the panic check is done?
>   --> I do see a bunch of code after the vtruncbuf() call related to
>       indirect blocks (which I think use the negative b_lblkno?), but
>       I'll admit I don't understand it well enough to know if it expects
>       vtruncbuf() to leave the negative block on the bo_hd list?
>
> Obviously fixing vtruncbuf() to get rid of these negative b_lblkno
> entries would be easy (see the untested sketch below), but I don't know
> if that is a feature or a bug?
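> To be concrete, here is one way it could be done (untested and only a
> sketch; trunclbn and the buffer fields are as in the snippet above, and
> the bo_dirty loop would need the same change):
>
>     TAILQ_FOREACH_SAFE(bp, &bo->bo_clean.bv_hd, b_bobufs, nbp) {
>             /*
>              * Indirect block buffers have negative b_lblkno values,
>              * so "b_lblkno < trunclbn" skips them even when the file
>              * is being truncated to 0 length.  Only skip them when
>              * some of the file's data is being kept.
>              */
>             if (bp->b_lblkno < trunclbn &&
>                 (bp->b_lblkno >= 0 || trunclbn > 0))
>                     continue;
>             ...
>     }
>
> Whether throwing the indirect block buffers away here is actually safe
> is exactly the feature-vs-bug question above.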
> I did look at the commit logs and vtruncbuf() has been like this for at
> least 10 years.
> (I can only guess very few run UFS without soft updates, or others would
> see these panic()s.)
>
> I am now running with soft updates enabled to avoid the crashes, but I
> can easily test any patch if others can suggest one to try.
>
> Thanks for your help with this, rick

Oh, and one more thing. Maybe having the buffer for an indirect block
hanging off the vnode at the end of an ffs_truncate() to 0 length is ok.
After all, this is happening in VOP_INACTIVE() and the vnode isn't being
recycled yet? (ie. the panic() test is not needed?)

rick
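ps: One thing that makes me think the panic() test might be too strict:
when the vnode does eventually get recycled, vgonel() flushes the bufobj
without any trunclbn-style cutoff, so the negative b_lblkno buffer would
go away then anyhow. Roughly (from memory, so a sketch of the relevant
bit of vgonel() in sys/kern/vfs_subr.c, not the exact code):

    /*
     * Clean out any buffers still associated with the vnode.
     * If the V_SAVE flush fails, just toss them.  Note there is
     * no logical block number cutoff here, unlike vtruncbuf().
     */
    if (vinvalbuf(vp, V_SAVE, 0, 0) != 0) {
            while (vinvalbuf(vp, 0, 0, 0) != 0)
                    ;
    }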