Date: Sat, 11 Aug 2018 12:05:25 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, "peter@holm.cc" <peter@holm.cc>
Subject: Re: ffs_truncate3 panics
Message-ID: <YTOPR0101MB18203F316E929C7AF7576BF0DD3B0@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20180810172941.GA2113@kib.kiev.ua>
References: <YTOPR0101MB18206289DDED97BE9DD38D14DD270@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180807131445.GC1884@kib.kiev.ua>
 <YTOPR0101MB18207C97903D3058A15091FFDD260@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180808221647.GH1884@kib.kiev.ua>
 <YTOPR0101MB18205CDE34ABCC2345C3F172DD250@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>
 <20180809111004.GK1884@kib.kiev.ua>
 <YTOPR0101MB182067FF6F908E0B2EBAE3D1DD250@YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM>,
 <20180810172941.GA2113@kib.kiev.ua>

Konstantin Belousov wrote:
>On Thu, Aug 09, 2018 at 08:38:50PM +0000, Rick Macklem wrote:
>> >BTW, does NFS server use extended attributes ? What for ? Can you, please,
>> >point out the code which does this ?
>> For the pNFS service, there are two system namespace extended attributes for
>> each file stored on the service.
>> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by the
>> pnfsdsfile(8) command.
>>
>> pnfsd.dsattr - Cached attributes that change when a file is written (size, mtime,
>> change) so that the MDS doesn't have to do a Getattr on the data server for every client Getattr.
>>
>
>My reading of the nfsd code + ffs extattr handling reminds me that you
>already reported this issue some time ago. I suspected ufs_balloc() at
>that time.
Yes. I had almost forgotten about them, because I have been testing with a
couple of machines (not big, but amd64 with a few Gbytes of RAM) and they never
hit the panic(). Recently, I've been using the 256Mbyte i386 and started
seeing them again.

>Now I think that the situation with the stray buffers hanging on the
>queue is legitimate, ffs_extread() might create such buffer and release
>it to a clean queue, then removal of the file would see inode with no
>allocated ext blocks but with the buffer.
>
>I think the easiest way to handle it is to always flush buffers and pages
>in the ext attr range, regardless of the number of allocated ext blocks.
>Patch below was not tested.
[patch deleted for brevity]

Well, the above sounds reasonable, but the patch didn't help.
Here's a small portion of the log from a test run last night.
- First, a couple of things about the printf()s. When they start with "CL=<N>",
  the printf() is at the start of ffs_truncate(). "<N>" is a static counter of
  calls to ffs_truncate(), so "same value" indicates same call.

CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
buf at 0x429f260
b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfa3f734), b_data = 0x4c90000, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x4c90000, b_kvasize = 32768

CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
buf at 0x429deb0
b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x5700000, b_kvasize = 32768

FFST3=34593 vtyp=1 bodirty=0 boclean=1
buf at 0x429deb0
b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x5700000, b_kvasize = 32768

So, the first one is what typically happens and there would be no panic().
The second/third would be a panic(), since the one that starts with "FFST3" is
a printf() that replaces the panic() call.
- Looking at the second/third, the number at the beginning is the same, so it
  is the same call, but for some reason, between the start of the function and
  where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but the
  buffer is still there (or has been re-created there by another thread?).

Looking at the code, I can't see how this could happen, since there is a
vinvalbuf() call after the only place in the code that sets di_extsize == 0,
from what I can see?

I am going to add printf()s after the vinvalbuf() calls, to make sure they are
happening and getting rid of the buffer.

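The tagging scheme described above (a static call counter printed at function
entry, then printed again at the check that used to panic) can be sketched in
user-space C roughly as follows. This is an illustration only and not the
actual debugging patch: struct fake_inode, fake_truncate(), format_tagged()
and the "flush" argument are hypothetical stand-ins, and the message layout is
simplified from the log lines.

```c
#include <stdio.h>

/* Hypothetical stand-in for the state the log messages report. */
struct fake_inode {
	int extsize;	/* stands in for di_extsize */
	int clean_bufs;	/* stands in for the clean buffer count (boclean) */
};

/* Write "<tag>=<seq> boclean=<n> diextsiz=<n>" into out. */
static void
format_tagged(char *out, size_t len, const char *tag, unsigned long seq,
    const struct fake_inode *ip)
{
	snprintf(out, len, "%s=%lu boclean=%d diextsiz=%d", tag, seq,
	    ip->clean_bufs, ip->extsize);
}

/*
 * Simulated truncate. entry_msg always receives the "CL=" line.
 * check_msg receives the "FFST3=" line only when the ext size is 0 but
 * a buffer is still present; otherwise it is left as an empty string.
 * Returns the sequence number assigned to this call.
 */
unsigned long
fake_truncate(struct fake_inode *ip, int flush, char *entry_msg,
    char *check_msg, size_t len)
{
	static unsigned long calls;	/* per-call tag; not thread-safe */
	unsigned long seq = ++calls;

	check_msg[0] = '\0';
	format_tagged(entry_msg, len, "CL", seq, ip);

	/* Assumed behaviour: drop the ext area, optionally flush buffers. */
	ip->extsize = 0;
	if (flush)
		ip->clean_bufs = 0;

	/* Report instead of panicking, reusing the same sequence number. */
	if (ip->extsize == 0 && ip->clean_bufs != 0)
		format_tagged(check_msg, len, "FFST3", seq, ip);
	return (seq);
}
```

Because both messages carry the same sequence number, a "CL=" line and an
"FFST3=" line can be matched to one call even when output from other calls is
interleaved in the log. In a real kernel the unlocked counter could race
between threads, so this sketch should not be read as safe for concurrent use.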
If another thread could somehow (re)create the buffer concurrently with the
ffs_truncate() call, that would explain it, I think?

Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the small
size of the machine and this makes the behaviour of ffs_truncate() confusing.

I'll post again when I have more info.
Thanks for looking at it, rick