From owner-freebsd-current@freebsd.org Sat Aug 11 12:05:27 2018 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 30EFC106A17A for ; Sat, 11 Aug 2018 12:05:27 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-TO1-obe.outbound.protection.outlook.com (mail-eopbgr670076.outbound.protection.outlook.com [40.107.67.76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B783479C63 for ; Sat, 11 Aug 2018 12:05:26 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM (52.132.44.160) by YTOPR0101MB1468.CANPRD01.PROD.OUTLOOK.COM (52.132.47.158) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1017.15; Sat, 11 Aug 2018 12:05:25 +0000 Received: from YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM ([fe80::f171:1f28:a0a2:f127]) by YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM ([fe80::f171:1f28:a0a2:f127%2]) with mapi id 15.20.1017.022; Sat, 11 Aug 2018 12:05:25 +0000 From: Rick Macklem To: Konstantin Belousov CC: "freebsd-current@FreeBSD.org" , "peter@holm.cc" Subject: Re: ffs_truncate3 panics Thread-Topic: ffs_truncate3 panics Thread-Index: AQHULkj8zfSrFB+Dkkqu0NmeGvZbQKS0RKaAgAGEoXWAAKUlgIAANVYCgACitwCAAJvZIoABYIyAgAEw+G8= Date: Sat, 11 Aug 2018 12:05:25 +0000 Message-ID: References: <20180807131445.GC1884@kib.kiev.ua> <20180808221647.GH1884@kib.kiev.ua> <20180809111004.GK1884@kib.kiev.ua> , <20180810172941.GA2113@kib.kiev.ua> In-Reply-To: <20180810172941.GA2113@kib.kiev.ua> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=rmacklem@uoguelph.ca; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; YTOPR0101MB1468; 6:syowBr8VQHPInBC0d+9Unjbat0TUrdNk5PBKfOB96nPqVm3f6iB4/b6f/Ddp0N2kREFK32sbPIQPhRontvq5RjLvr+b2QVVe2lTzCbFvhPHIjiLMftiH+3S2rk/q7bUw3BGqAAxaHXMUrUlNwmzAGFrOx4TB0sezWuBWz7++b8/vkEfBZR2okB+ead/LDWHWIOaCzRQsIjkG3cXFWHCULgPG69fkZIB4QrgvDZw5EvlIPXieizBq9HSii1rVKb0Fek1tB7QCWkYt3ysTwfkc7J9IMye5nNjipZ6TTL830GUjzv/qMhoruiK/hiAcV6edcPV/hyiqY0Aa7Lqqz4oVvU3EX8CVF8joi/Sug4QKezZM/99eYgmQL3orSmpJGpdjymdcYYlUdr6PXL6THbHPzuu+EXmzhyO9Yucp/UGWClnMLxBaF4AwKdbxSDJZ+Ld6daqArzfUDg7AzxJxT+YSSw==; 5:RLG96wuUts4MC4qNZORym+LwsTiWuYpz7lL2C/kMbArL/QzwofRrV8Z4GAzMfTBPsLFlYjFxHOY5YPu3f2zI2gvpDnJjUkoixVPdcXnORQxU8058pToOfDPGDFMunKioTQulUSQ9UYdNkpc/6AItm/D5qoStKVI4Azy4K42DLno=; 7:x/aYUF6CFfLXKaB1TEh1M+SKR4Qv//VQ73tcy/uV8pYeb26BczwMknMJotJBJQkAOPOxttxOJ6Ooj/6wOK8Vb7R7X8Inu7/1xYGgEjgCeAd5K3CcfuU4UW89uJQqrsKP70N6RsEq8BPYkyTvjUsi+qKFwsqCxYgH+vMf4nPrC4v0fui3kDAqS4CSxvY2B7wduSmxMzeHOujbrflzH3Es8Gt446SstCOlEHLS3F0/+ZJ8Vrsi/MyzC4I+LhFe1bWF x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: ae99d1a9-d860-4db1-e18e-08d5ff82b8a1 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(7020095)(4652040)(8989117)(4534165)(4627221)(201703031133081)(201702281549075)(8990107)(5600074)(711020)(2017052603328)(7153060)(7193020); SRVR:YTOPR0101MB1468; x-ms-traffictypediagnostic: YTOPR0101MB1468: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863); x-ms-exchange-senderadcheck: 1 x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040522)(2401047)(8121501046)(5005006)(93006095)(93001095)(3231311)(944501410)(52105095)(3002001)(10201501046)(149027)(150027)(6041310)(201703131423095)(201702281529075)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123558120)(20161123564045)(20161123562045)(20161123560045)(6072148)(201708071742011)(7699016); SRVR:YTOPR0101MB1468; BCL:0; PCL:0; RULEID:; SRVR:YTOPR0101MB1468; x-forefront-prvs: 0761DE1EDD x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(136003)(366004)(39860400002)(376002)(346002)(396003)(51444003)(199004)(189003)(8676002)(81156014)(81166006)(14454004)(39060400002)(786003)(6916009)(6506007)(33656002)(26005)(4326008)(102836004)(186003)(8936002)(54906003)(74316002)(478600001)(316002)(305945005)(105586002)(25786009)(99286004)(2906002)(1411001)(106356001)(5250100002)(446003)(6436002)(486006)(476003)(11346002)(229853002)(68736007)(74482002)(5660300001)(256004)(2900100001)(7696005)(53936002)(7116003)(6246003)(86362001)(97736004)(55016002)(93886005)(76176011)(9686003); DIR:OUT; SFP:1101; SCL:1; SRVR:YTOPR0101MB1468; H:YTOPR0101MB1820.CANPRD01.PROD.OUTLOOK.COM; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: uoguelph.ca does not designate permitted sender hosts) x-microsoft-antispam-message-info: j4ZeDeaf1tDoQra3RgblpPICUgqABB4Y2497Syo+h5+6bA8QIcdGEpBMf9HsRgEVl/9hQE0kj2MLSs5vzG0BTGG4XemvWDoL9oc/6ojth6w+xhxOM4VyQIcm0CsFqVqbm9J6hWmcawkr97KHA3tdFIWTSLoWC94DlafyosFF/ceNr/qtT2lfDtQPMFaMmSJK8dihokrGb6JT9uW4rIhoVPfgxKyr4u08EXwJPqUB7DjrL4qVPB3D20wnni3kWtvfG+YOTpNHQgHhqZF4ot1h8R+MuncBsV79eliFE8hmU74xLa9MDIFmJOu56ocK+8k3TFYOdBjsGwG9i8CsvfzuufZNlO0Hi5OgRuNY86CKpi4= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-Network-Message-Id: ae99d1a9-d860-4db1-e18e-08d5ff82b8a1 X-MS-Exchange-CrossTenant-originalarrivaltime: 11 Aug 2018 12:05:25.0772 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-Transport-CrossTenantHeadersStamped: YTOPR0101MB1468 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Aug 2018 12:05:27 -0000 Konstantin Belousov wrote: >On Thu, Aug 09, 2018 at 08:38:50PM +0000, Rick Macklem wrote: >> >BTW, does NFS server use extended attributes ? What for ? Can you, pl= ease, >> >point out the code which does this ? >> For the pNFS service, there are two system namespace extended attributes= for >> each file stored on the service. >> pnfsd.dsfile - Stores where the data for the file is. Can be displayed b= y the >> pnfsdsfile(8) command. >> >> pnfsd.dsattr - Cached attributes that change when a file is written (siz= e, mtime, >> change) so that the MDS doesn't have to do a Getattr on the data server = for every client Getattr. >> > >My reading of the nfsd code + ffs extattr handling reminds me that you >already reported this issue some time ago. I suspected ufs_balloc() at >that time. Yes. I had almost forgotten about them, because I have been testing with a couple of machines (not big, but amd64 with a few Gbytes of RAM) and they never hit the panic(). Recently, I've been using the 256Mbyte i386 and star= ted seeing them again. >Now I think that the situation with the stray buffers hanging on the >queue is legitimate, ffs_extread() might create such buffer and release >it to a clean queue, then removal of the file would see inode with no >allocated ext blocks but with the buffer. > >I think the easiest way to handle it is to always flush buffers and pages >in the ext attr range, regardless of the number of allocated ext blocks. >Patch below was not tested. [patch deleted for brevity] Well, the above sounds reasonable, but the patch didn't help. Here's a small portion of the log a test run last night. - First, a couple of things about the printf()s. When they start with "CL= =3D", the printf() is at the start of ffs_truncate(). "" is a static counter= of calls to ffs_truncate(), so "same value" indicates same call. CL=3D31816 flags=3D0xc00 vtyp=3D1 bodirty=3D0 boclean=3D1 diextsiz=3D320 buf at 0x429f260 b_flags =3D 0x20001020, b_xflags=3D0x2, b_vflags= =3D0x0 b_error =3D 0, b_bufsize =3D 4096, b_bcount =3D 4096, b_resid =3D 0 b_bufobj =3D (0xfa3f734), b_data =3D 0x4c90000, b_blkno =3D -1, b_lblkno = =3D -1, b_dep =3D 0 b_kvabase =3D 0x4c90000, b_kvasize =3D 32768 CL=3D34593 flags=3D0xc00 vtyp=3D1 bodirty=3D0 boclean=3D1 diextsiz=3D320 buf at 0x429deb0 b_flags =3D 0x20001020, b_xflags=3D0x2, b_vflags= =3D0x0 b_error =3D 0, b_bufsize =3D 4096, b_bcount =3D 4096, b_resid =3D 0 b_bufobj =3D (0xfd3da94), b_data =3D 0x5700000, b_blkno =3D -1, b_lblkno = =3D -1, b_dep =3D 0 b_kvabase =3D 0x5700000, b_kvasize =3D 32768 FFST3=3D34593 vtyp=3D1 bodirty=3D0 boclean=3D1 buf at 0x429deb0 b_flags =3D 0x20001020, b_xflags=3D0x2, b_vflags= =3D0x0 b_error =3D 0, b_bufsize =3D 4096, b_bcount =3D 4096, b_resid =3D 0 b_bufobj =3D (0xfd3da94), b_data =3D 0x5700000, b_blkno =3D -1, b_lblkno = =3D -1, b_dep =3D 0 b_kvabase =3D 0x5700000, b_kvasize =3D 32768 So, the first one is what typically happens and there would be no panic(). The second/third would be a panic(), since the one that starts with "FFST3= " is a printf() that replaces the panic() call. - Looking at the second/third, the number at the beginning is the same, so = it is the same call, but for some reason, between the start of the function and where the ffs_truncate3 panic() test is, di_extsize has been set to 0, bu= t the buffer is still there (or has been re-created there by another thread?). Looking at the code, I can't see how this could happen, since there is a vi= nvalbuf() call after the only place in the code that sets di_extsize =3D=3D 0, from w= hat I can see? I am going to add printf()s after the vinvalbuf() calls, to make sure they = are happening and getting rid of the buffer. If another thread could somehow (re)create the buffer concurrently with the ffs_truncate() call, that would explain it, I think? Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the= small size of the machine and this makes the behaviour of ffs_truncate() confusin= g. I'll post again when I have more info. Thanks for looking at it, rick