Date: Sat, 11 Aug 2018 15:37:55 +0300
From: Konstantin Belousov
To: Rick Macklem
Cc: "freebsd-current@FreeBSD.org", "peter@holm.cc"
Subject: Re: ffs_truncate3 panics
Message-ID: <20180811123755.GD2113@kib.kiev.ua>
References: <20180807131445.GC1884@kib.kiev.ua> <20180808221647.GH1884@kib.kiev.ua> <20180809111004.GK1884@kib.kiev.ua> <20180810172941.GA2113@kib.kiev.ua>
List-Id: Discussions about the use of FreeBSD-current

On Sat, Aug 11, 2018 at 12:05:25PM +0000, Rick Macklem wrote:
> Konstantin Belousov wrote:
> >On Thu, Aug 09, 2018 at 08:38:50PM +0000, Rick Macklem wrote:
> >> >BTW, does NFS server use extended attributes ? What for ? Can you, please,
> >> >point out the code which does this ?
> >> For the pNFS service, there are two system namespace extended attributes for
> >> each file stored on the service.
> >> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by the
> >> pnfsdsfile(8) command.
> >>
> >> pnfsd.dsattr - Cached attributes that change when a file is written (size, mtime,
> >> change) so that the MDS doesn't have to do a Getattr on the data server for every client Getattr.
> >>
> >
> >My reading of the nfsd code + ffs extattr handling reminds me that you
> >already reported this issue some time ago. I suspected ufs_balloc() at
> >that time.
> Yes. I had almost forgotten about them, because I have been testing with a
> couple of machines (not big, but amd64 with a few Gbytes of RAM) and they
> never hit the panic(). Recently, I've been using the 256Mbyte i386 and started
> seeing them again.
>
> >Now I think that the situation with the stray buffers hanging on the
> >queue is legitimate, ffs_extread() might create such buffer and release
> >it to a clean queue, then removal of the file would see inode with no
> >allocated ext blocks but with the buffer.
> >
> >I think the easiest way to handle it is to always flush buffers and pages
> >in the ext attr range, regardless of the number of allocated ext blocks.
> >Patch below was not tested.
> [patch deleted for brevity]
> Well, the above sounds reasonable, but the patch didn't help.
> Here's a small portion of the log a test run last night.
> - First, a couple of things about the printf()s. When they start with "CL=",
>   the printf() is at the start of ffs_truncate(). "" is a static counter of calls to
>   ffs_truncate(), so "same value" indicates same call.
>
>
> CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429f260
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfa3f734), b_data = 0x4c90000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x4c90000, b_kvasize = 32768
>
> CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429deb0
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768
>
> FFST3=34593 vtyp=1 bodirty=0 boclean=1
> buf at 0x429deb0
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768

The problem with this buffer is that the BX_ALTDATA bit is not set. This is
the reason why vinvalbuf(V_ALT) skips it.

>
> So, the first one is what typically happens and there would be no panic().
> The second/third would be a panic(), since the one that starts with "FFST3"
> is a printf() that replaces the panic() call.
> - Looking at the second/third, the number at the beginning is the same, so it is
>   the same call, but for some reason, between the start of the function and
>   where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but the
>   buffer is still there (or has been re-created there by another thread?).
>
> Looking at the code, I can't see how this could happen, since there is a vinvalbuf()
> call after the only place in the code that sets di_extsize == 0, from what I can see?
> I am going to add printf()s after the vinvalbuf() calls, to make sure they are
> happening and getting rid of the buffer.
>
> If another thread could somehow (re)create the buffer concurrently with the
> ffs_truncate() call, that would explain it, I think?

The vnode is exclusively locked. Another thread must not be able to
instantiate a buffer under us.

>
> Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the small
> size of the machine and this makes the behaviour of ffs_truncate() confusing.

This is the patch that I posted a long time ago. It is obviously related
to the missing BX_ALTDATA. Can you add this patch to your kernel?

diff --git a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c
index 552c295753d..6d89a229ea7 100644
--- a/sys/ufs/ffs/ffs_balloc.c
+++ b/sys/ufs/ffs/ffs_balloc.c
@@ -682,8 +682,16 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 				    ffs_blkpref_ufs2(ip, lbn, (int)lbn,
 				    &dp->di_extb[0]), osize, nsize, flags,
 				    cred, &bp);
-				if (error)
+				if (error != 0) {
+					/* getblk does truncation, if needed */
+					bp = getblk(vp, -1 - lbn, osize, 0, 0,
+					    GB_NOCREAT);
+					if (bp != NULL) {
+						bp->b_xflags |= BX_ALTDATA;
+						brelse(bp);
+					}
 					return (error);
+				}
 				bp->b_xflags |= BX_ALTDATA;
 				if (DOINGSOFTDEP(vp))
 					softdep_setup_allocext(ip, lbn,
@@ -699,8 +707,17 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 			error = ffs_alloc(ip, lbn,
 			   ffs_blkpref_ufs2(ip, lbn, (int)lbn, &dp->di_extb[0]),
 			   nsize, flags, cred, &newb);
-			if (error)
+			if (error != 0) {
+				bp = getblk(vp, -1 - lbn, nsize, 0, 0,
+				    GB_NOCREAT);
+				if (bp != NULL) {
+					bp->b_xflags |= BX_ALTDATA;
+					bp->b_flags |= B_RELBUF | B_INVAL;
+					bp->b_flags &= ~B_ASYNC;
+					brelse(bp);
+				}
 				return (error);
+			}
 			bp = getblk(vp, -1 - lbn, nsize, 0, 0, gbflags);
 			bp->b_blkno = fsbtodb(fs, newb);
 			bp->b_xflags |= BX_ALTDATA;
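
For readers following the thread, here is a small user-space C model (not
kernel code) of the filter this message describes: a flush restricted to
extended-attribute data (V_ALT) passes over any buffer whose b_xflags lacks
BX_ALTDATA, which would explain why the logged buffer with b_xflags=0x2
survives. Every MODEL_* name, the struct, and the numeric flag values are
assumptions made for this sketch only; the real definitions live in
sys/buf.h and sys/vnode.h and should be checked there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in flag values for this sketch only (assumed, not authoritative);
 * confirm the real ones in sys/buf.h and sys/vnode.h.
 */
#define MODEL_BX_VNCLEAN	0x00000002u	/* buffer is on the clean list */
#define MODEL_BX_ALTDATA	0x00000040u	/* buffer holds ext attr data */
#define MODEL_V_ALT		0x0002		/* flush ext attr buffers only */
#define MODEL_V_NORMAL		0x0004		/* flush regular data buffers only */

/* Hypothetical, stripped-down buffer: only the field the filter reads. */
struct model_buf {
	uint32_t b_xflags;
};

/*
 * Returns true when a flush with the given mode flags would skip bp.
 * A V_NORMAL flush skips ext attr buffers; a V_ALT flush skips
 * everything that is not tagged as ext attr data.
 */
static bool
model_flush_skips(const struct model_buf *bp, int flags)
{
	bool alt;

	assert(bp != NULL);
	alt = (bp->b_xflags & MODEL_BX_ALTDATA) != 0;
	if ((flags & MODEL_V_NORMAL) != 0 && alt)
		return (true);
	if ((flags & MODEL_V_ALT) != 0 && !alt)
		return (true);
	return (false);
}

/* Models the tagging step the patch adds on the allocation error paths. */
static void
model_mark_altdata(struct model_buf *bp)
{
	assert(bp != NULL);
	bp->b_xflags |= MODEL_BX_ALTDATA;
}
```

With b_xflags set to 0x2 as in the log, model_flush_skips() with MODEL_V_ALT
reports a skip; after model_mark_altdata() it no longer does. That mirrors
what the error paths in the patch do before releasing the buffer.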