From: Rick Macklem
To: Daniel Braniss
CC: Richard P Mackerras, Adam McDougall, "freebsd-stable@freebsd.org"
Subject: Re: nfs lockd errors after NetApp software upgrade.
Date: Thu, 9 Jan 2020 01:03:46 +0000

I hope you don't mind the top post, but...
Here's a snippet of code from the krpc (I wasn't the author):

	if (stat == RPC_TIMEDOUT) {
		/*
		 * Check for async send misfeature for NLM
		 * protocol.
		 */
		if ((rc->rc_timeout.tv_sec == 0
		    && rc->rc_timeout.tv_usec == 0)
		    || (rc->rc_timeout.tv_sec == -1
		    && utimeout.tv_sec == 0
		    && utimeout.tv_usec == 0)) {
			CLNT_RELEASE(client);
			break;
		}
	}

This causes the xid to be reinitialized when a timeout occurs.
The reinitialization uses __RPC_GETXID(&now) and it does an exclusive or of
pid ^ time.sec ^ time.usec so it shouldn't end up the same anyhow.
(Normally this initialization only occurs once, but because of the above, it
could happen multiple times for the NLM.)

What does "async misfeature" mean? I have no idea.

If by "transaction id" they are referring to the svid in the lock RPC message,
I have no idea if it should be unique for lock ops on different files.
What does the spec. say? No idea, since there is no such thing.

Anyhow, using TCP will avoid the DRC and whatever the Netapp filer thinks
w.r.t. the uniqueness of this field.
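The xid seeding described above can be sketched as follows. This is only an illustration of the "pid ^ time.sec ^ time.usec" mixing mentioned in the message; the function name sketch_getxid() and its argument types are assumptions made for this example, not the krpc source.

```c
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not the krpc code: mix the process id with the
 * seconds and microseconds of the current time using exclusive or,
 * as the message describes for __RPC_GETXID().
 */
static uint32_t
sketch_getxid(pid_t pid, long sec, long usec)
{
	return ((uint32_t)pid ^ (uint32_t)sec ^ (uint32_t)usec);
}
```

Two reinitializations by the same process at different times will normally give different values, although an exclusive or cannot rule out a repeat if sec ^ usec happens to come out the same.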
rick

________________________________________
From: Daniel Braniss
Sent: Wednesday, January 8, 2020 12:08 PM
To: Rick Macklem
Cc: Richard P Mackerras; Adam McDougall; freebsd-stable@freebsd.org
Subject: Re: nfs lockd errors after NetApp software upgrade.

top posting
NetAPP reply:
…
Here you can see transaction ID (0x5e15f77a) being used over port 886 and the NFS server successfully responds.

  4480695  2020-01-08 12:20:54  132.65.116.111  132.65.60.56    NLM  0x5e15f77a (1578497914)  886   V4 UNLOCK Call (Reply In 4480696) FH:0x54b075a0 svid:13629 pos:0-0
  4480696  2020-01-08 12:20:54  132.65.60.56    132.65.116.111  NLM  0x5e15f77a (1578497914)  4045  V4 UNLOCK Reply (Call In 4480695)

Here you see that 2 minutes later the client uses the same transaction ID (0x5e15f77a) and the same port again, but the file handle is different, so the client is unlocking a different file.

  4591136  2020-01-08 12:22:54  132.65.116.111  132.65.60.56  NLM  0x5e15f77a (1578497914)  886  [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
  4592588  2020-01-08 12:22:57  132.65.116.111  132.65.60.56  NLM  0x5e15f77a (1578497914)  886  [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
  4598862  2020-01-08 12:23:03  132.65.116.111  132.65.60.56  NLM  0x5e15f77a (1578497914)  886  [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
  4608871  2020-01-08 12:23:21  132.65.116.111  132.65.60.56  NLM  0x5e15f77a (1578497914)  886  [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
  4635984  2020-01-08 12:23:59  132.65.116.111  132.65.60.56  NLM  0x5e15f77a (1578497914)  886  [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0

transaction ID reuse is also seen for a number of other transaction IDs starting at the same time.
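The capture above shows one transaction ID and source port carrying two different file handles. As a rough sketch of why that matters to a duplicate request cache, assume a hypothetical cache entry keyed on client address, source port and XID that also stores a checksum of the request body. The struct, enum and function names below are invented for illustration and are not taken from ONTAP or FreeBSD.

```c
#include <stdint.h>

/* Hypothetical replay-cache entry; field names are illustrative only. */
struct drc_entry {
	uint32_t client_addr;	/* client IPv4 address */
	uint16_t client_port;	/* client source port */
	uint32_t xid;		/* RPC transaction ID */
	uint32_t checksum;	/* checksum over the request body */
};

enum drc_result {
	DRC_MISS,		/* different client, port or XID: new request */
	DRC_REPLAY,		/* same key and same body: true retransmission */
	DRC_XID_REUSED		/* same key but different body */
};

/* Compare an incoming request against one cached entry. */
static enum drc_result
drc_classify(const struct drc_entry *cached, const struct drc_entry *req)
{
	if (cached->client_addr != req->client_addr ||
	    cached->client_port != req->client_port ||
	    cached->xid != req->xid)
		return (DRC_MISS);
	if (cached->checksum == req->checksum)
		return (DRC_REPLAY);
	return (DRC_XID_REUSED);
}
```

Under these assumptions, the UNLOCK in frame 4591136 would match the cached key of frame 4480695 but not its checksum, because the file handle differs, so it would fall into the DRC_XID_REUSED case rather than being treated as a plain retransmission.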
Within ONTAP 9.3 we have changed the way our Replay-Cache tracks requests by including a checksum of the RPC request. Both in this and earlier releases ONTAP would cache the call in frame 4480695, but starting in 9.3 we then cache the checksum as part of that.

When the client sends the request in frame 4591136 it uses the same transaction ID (0x5e15f77a) and same port again. Here the problem is that we already hold a checksum in cache for the “same transaction”
…

this seems to be happening after the client did not receive the response and re-transmits the request.

danny

On 24 Dec 2019, at 5:02, Rick Macklem wrote:

Richard P Mackerras wrote:
> Hi,
> We had some bully type workloads emerge when we moved a lot of block
> storage from old XIV to new all flash 3PAR. I wonder if your IMAP issue
> might have emerged just because suddenly there was the opportunity with
> all flash. QOS is good on 9.x ONTAP. If anyone says it’s not then they
> last looked on 8.x. So I suggest you QOS the IMAP workload.
> Nobody should be using UDP with NFS unless they have a very specific set
> of circumstances. TCP was a real step forward.

Well, I can't argue with this, considering I did the first working implementation
of NFS over TCP. It was actually Mike Karels that suggested I try doing so.
There's a paper in a very old Usenix Conference Proceedings, but it is so old
that it isn't on the Usenix web page (around 1988 in Denver, if I recall). I don't
even have a copy myself, although I was the author.

Now, having said that, I must note that the Network Lock Manager (NLM)
and Network Status Monitor (NSM) were not NFS. They were separate
stateful protocols (poorly designed imho) that Sun never published.

NFS as Sun designed it (NFSv2 and NFSv3) were "stateless server" protocols,
so that they could work reliably without server crash recovery.
However, the NLM was inherently stateful, since it was dealing with file locks.
So, you can't really lump the NLM with NFS (and you should avoid use of the
NLM over any transport imho).

NFSv4 tackled the difficult problem of having a "stateful server" and crash recovery,
which resulted in a much more complex protocol (compare the size of RFC-1813 vs
RFC-5661 to get some idea of this).

rick

> Cheers
> Richard

_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"