From: Rick Macklem <rmacklem@uoguelph.ca>
To: Alan Somers, FreeBSD Hackers
Subject: Re: panic!("docallb") in nfsrv_docallback
Date: Sat, 5 Sep 2020 04:14:19 +0000
List-Id: Technical Discussions relating to FreeBSD

Alan Somers wrote:
>I just saw this panic on a 12-stable machine. Unfortunately, I don't have
>a core dump, just a stack trace. It was serving NFS v4.0, with delegations
>enabled. The clients were all Debian, with Linux 3.16.0.
>
>The proximal cause of the panic seems to be that the file had a write
>delegation issued to an unconfirmed client. Root cause is harder to
>determine. Did the kernel previously issue a delegation to an unconfirmed
>client? Or did the client somehow change to an unconfirmed state after the
>delegation was issued, perhaps due to a race?
>
>It's hard to tell, but I don't see any checks for lc_flags &
>LCL_NEEDSCONFIRM in nfsrv_openctrl (which issues the delegations), so I'm
>guessing that that's the problem.
I guess I should have looked at the code before doing the last post.
The check is in nfsrv_getclient(), that is called by nfsrv_opencheck().
nfsrv_opencheck() - Checks to see if an Open is allowed.
nfsrv_openctrl() - Does the Open, assuming nfsrv_opencheck() determined it
                   was allowed.

> If so, then the event trace would look
>like this:
>
>1) Client Alice sends SETCLIENTID. The server creates a client state
>   structure for her.
>_) Client Alice should've sent SETCLIENTID_CONFIRM, but doesn't. Bad Alice!
>2) Client Alice sends OPEN for some file, and is issued a write delegation.
>   The server shouldn't have issued it, because Alice's client ID is
>   unconfirmed. Bad server!
I don't think this can happen.
From looking at the code, an NFSERR_EXPIRED
reply to the Open should have happened.

>3) Client Bob tries to do a GETATTR on that same file.
>4) In nfsrv_checkgetattr, the kernel finds a write delegation for that file,
>   owned by client Alice.
I think the server needs to check for LCL_NEEDSCONFIRM in here.
It gets the "clp" from the FH, but it "assumes" a confirmed ClientID.

I'll code up a patch to add this check to nfsrv_checkgetattr().

>5) The kernel tries to send a NFSV4OP_CBGETATTR callback to Alice, to see
>   if the file's attributes have changed.
>6) But Alice's client ID is unconfirmed. Oh no! Panic!
>
>Does this sound plausible? Should there be a check for LCL_NEEDSCONFIRM
>somewhere around line 3166 in nfs_nfsdstate.c? Grateful for any help.
Well, it doesn't appear that the Open could occur when the ClientID was
not confirmed.
--> The obvious case you listed above is caught by nfsrv_opencheck().
Now, could a SetClientID happen between nfsrv_opencheck() and
nfsrv_openctrl()?
--> I don't think so.
If you look at nfsrv_setclient(), which does SetClientID,
    it grabs the nfsv4rootfs_lock, locking out all other nfsd threads.
    It can't acquire the lock while a "shared lock" (which I called a
    refcnt) is held by any other nfsd thread, and the thread doing an
    Open will hold the shared lock (refcnt).

So, I think the Open with delegation would have been issued when the
ClientID was confirmed.
--> Then I suspect the client did another SetClientID that put the ClientID
    back to unconfirmed (the obvious one is a client reboot).
    --> One quirk of SetClientID/SetClientIDConfirm is that old Open/Lock
        state cannot be discarded until it is confirmed, so the old
        Delegations would remain on the ClientID.
--> Then a client did a Getattr on the file that had the old Delegation still
    allocated to it.

I'd definitely say that nfsrv_checkgetattr() needs to check for an unconfirmed
ClientID and return without attempting a callback.
--> The exclusive lock on nfsv4rootfs_lock acquired when doing SetClientID
    should also guarantee that, if the ClientID is confirmed when
    nfsrv_checkgetattr() tests for it, it will remain that way until after
    the callback is completed.

I'll come up with a patch and stick it on phabricator, rick



-Alan

P.S.: stack trace

kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
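
For illustration only, the check proposed earlier in this message for
nfsrv_checkgetattr() (return without attempting a callback when the
ClientID is unconfirmed) could be modeled in user space roughly as below.
This is not the actual patch and not kernel code: struct sketch_client,
SKETCH_NEEDSCONFIRM (with an arbitrary value) and sketch_should_callback()
are hypothetical names that only mimic lc_flags and LCL_NEEDSCONFIRM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LCL_NEEDSCONFIRM; the value is arbitrary. */
#define SKETCH_NEEDSCONFIRM	0x1u

/* Hypothetical, heavily reduced model of the server's per-client state. */
struct sketch_client {
	uint32_t flags;		/* models lc_flags */
	bool has_write_deleg;	/* client holds a write delegation on the file */
};

/*
 * Decide whether a Getattr from another client should trigger a
 * CB_GETATTR callback to the delegation holder.
 * Returns false when no callback should be attempted.
 */
bool
sketch_should_callback(const struct sketch_client *clp)
{
	/* No delegation holder, or no write delegation: nothing to ask. */
	if (clp == NULL || !clp->has_write_deleg)
		return (false);
	/* Unconfirmed ClientID: skip the callback instead of panicking. */
	if ((clp->flags & SKETCH_NEEDSCONFIRM) != 0)
		return (false);
	return (true);
}
```

In this model, a stale delegation left on a ClientID that went back to
unconfirmed simply yields "no callback". As argued above, in the real
server the result of such a test would only stay valid because SetClientID
needs the exclusive nfsv4rootfs_lock; the sketch does not model locking.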