Date: Sat, 11 Aug 2018 15:37:55 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, "peter@holm.cc" <peter@holm.cc>
Subject: Re: ffs_truncate3 panics
Message-ID: <20180811123755.GD2113@kib.kiev.ua>
In-Reply-To: <YTOPR0101MB18203F316E929C7AF7576BF0DD3B0@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
References: <YTOPR0101MB18206289DDED97BE9DD38D14DD270@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180807131445.GC1884@kib.kiev.ua>
 <YTOPR0101MB18207C97903D3058A15091FFDD260@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180808221647.GH1884@kib.kiev.ua>
 <YTOPR0101MB18205CDE34ABCC2345C3F172DD250@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180809111004.GK1884@kib.kiev.ua>
 <YTOPR0101MB182067FF6F908E0B2EBAE3D1DD250@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180810172941.GA2113@kib.kiev.ua>
 <YTOPR0101MB18203F316E929C7AF7576BF0DD3B0@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>

On Sat, Aug 11, 2018 at 12:05:25PM +0000, Rick Macklem wrote:
> Konstantin Belousov wrote:
> >On Thu, Aug 09, 2018 at 08:38:50PM +0000, Rick Macklem wrote:
> >> >BTW, does NFS server use extended attributes ? What for ? Can you, please,
> >> >point out the code which does this ?
> >> For the pNFS service, there are two system namespace extended attributes for
> >> each file stored on the service.
> >> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by the
> >> pnfsdsfile(8) command.
> >>
> >> pnfsd.dsattr - Cached attributes that change when a file is written (size, mtime,
> >> change) so that the MDS doesn't have to do a Getattr on the data server for every client Getattr.
> >>
> >
> >My reading of the nfsd code + ffs extattr handling reminds me that you
> >already reported this issue some time ago. I suspected ufs_balloc() at
> >that time.
> Yes. I had almost forgotten about them, because I have been testing with a
> couple of machines (not big, but amd64 with a few Gbytes of RAM) and they
> never hit the panic(). Recently, I've been using the 256Mbyte i386 and started
> seeing them again.
>
> >Now I think that the situation with the stray buffers hanging on the
> >queue is legitimate, ffs_extread() might create such buffer and release
> >it to a clean queue, then removal of the file would see inode with no
> >allocated ext blocks but with the buffer.
> >
> >I think the easiest way to handle it is to always flush buffers and pages
> >in the ext attr range, regardless of the number of allocated ext blocks.
> >Patch below was not tested.
> [patch deleted for brevity]
> Well, the above sounds reasonable, but the patch didn't help.
> Here's a small portion of the log a test run last night.
> - First, a couple of things about the printf()s. When they start with "CL=<N>",
>   the printf() is at the start of ffs_truncate(). "<N>" is a static counter of calls to
>   ffs_truncate(), so "same value" indicates same call.
>
> CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429f260
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfa3f734), b_data = 0x4c90000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x4c90000, b_kvasize = 32768
>
> CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429deb0
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768
>
> FFST3=34593 vtyp=1 bodirty=0 boclean=1
> buf at 0x429deb0
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768

Problem with this buffer is that BX_ALTDATA bit is not set. This is the
reason why vinvalbuf(V_ALT) skips it.

>
> So, the first one is what typically happens and there would be no panic().
> The second/third would be a panic(), since the one that starts with "FFST3"
> is a printf() that replaces the panic() call.
> - Looking at the second/third, the number at the beginning is the same, so it is
>   the same call, but for some reason, between the start of the function and
>   where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but the
>   buffer is still there (or has been re-created there by another thread?).
>
> Looking at the code, I can't see how this could happen, since there is a vinvalbuf()
> call after the only place in the code that sets di_extsize == 0, from what I can see?
> I am going to add printf()s after the vinvalbuf() calls, to make sure they are
> happening and getting rid of the buffer.
>
> If another thread could somehow (re)create the buffer concurrently with the
> ffs_truncate() call, that would explain it, I think?

The vnode is exclusively locked. Other thread must not be able to
instantiate a buffer under us.

>
> Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the small
> size of the machine and this makes the behaviour of ffs_truncate() confusing.

This is the patch that I posted long time ago. It is obviously related
to missed BX_ALTDATA. Can you add this patch to your kernel ?

diff --git a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c
index 552c295753d..6d89a229ea7 100644
--- a/sys/ufs/ffs/ffs_balloc.c
+++ b/sys/ufs/ffs/ffs_balloc.c
@@ -682,8 +682,16 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 				    ffs_blkpref_ufs2(ip, lbn, (int)lbn,
 				    &dp->di_extb[0]), osize, nsize, flags,
 				    cred, &bp);
-				if (error)
+				if (error != 0) {
+					/* getblk does truncation, if needed */
+					bp = getblk(vp, -1 - lbn, osize, 0, 0,
+					    GB_NOCREAT);
+					if (bp != NULL) {
+						bp->b_xflags |= BX_ALTDATA;
+						brelse(bp);
+					}
 					return (error);
+				}
 				bp->b_xflags |= BX_ALTDATA;
 				if (DOINGSOFTDEP(vp))
 					softdep_setup_allocext(ip, lbn,
@@ -699,8 +707,17 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 			error = ffs_alloc(ip, lbn,
 			   ffs_blkpref_ufs2(ip, lbn, (int)lbn, &dp->di_extb[0]),
 			   nsize, flags, cred, &newb);
-			if (error)
+			if (error != 0) {
+				bp = getblk(vp, -1 - lbn, nsize, 0, 0,
+				    GB_NOCREAT);
+				if (bp != NULL) {
+					bp->b_xflags |= BX_ALTDATA;
+					bp->b_flags |= B_RELBUF | B_INVAL;
+					bp->b_flags &= ~B_ASYNC;
+					brelse(bp);
+				}
 				return (error);
+			}
 			bp = getblk(vp, -1 - lbn, nsize, 0, 0, gbflags);
 			bp->b_blkno = fsbtodb(fs, newb);
 			bp->b_xflags |= BX_ALTDATA;