Date: Wed, 8 Jan 2020 19:08:07 +0200
From: Daniel Braniss <danny@cs.huji.ac.il>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Richard P Mackerras <mack63richard@gmail.com>, Adam McDougall <mcdouga9@egr.msu.edu>, "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject: Re: nfs lockd errors after NetApp software upgrade.
Message-ID: <EE3DA2CA-9567-49F1-A71E-ABC706AA568E@cs.huji.ac.il>
In-Reply-To: <YQBPR0101MB142781B3EF4F85A1A6ED2AE5DD290@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
References: <EBC4AD74-EC62-4C67-AB93-1AA91F662AAC@cs.huji.ac.il>
 <YQBPR0101MB1427411AFE335E869B9CF022DD530@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <0121E289-D2AE-44BA-ADAC-4814CAEE676F@cs.huji.ac.il>
 <CAGfybS-3Rvs57=oGFEfii_9a=aWxPr6dEq1Y1LqHbLXK1ZKmXA@mail.gmail.com>
 <YQBPR0101MB1427F9BE658B9A46C7E08335DD520@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <854B6E5A-C6BC-44B3-A656-FC9B8EF19881@cs.huji.ac.il>
 <YQBPR0101MB1427F445F1F1EAF382E5131ADD520@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <8770BD0D-4B72-431A-B4F5-A29D4DBA03B1@cs.huji.ac.il>
 <b1182bbf-fd0b-a23d-1cc4-ddf9513bcb2e@egr.msu.edu>
 <YQBPR0101MB1427CE52BBA32A888443BFB4DD2D0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <8A78F67B-C244-45CF-B9BF-D7062669B33B@cs.huji.ac.il>
 <YQBPR0101MB1427C9D4CF8918F10B6FD400DD2C0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <AE8F5D6B-E7DA-4AB9-B909-7D362A6A406B@cs.huji.ac.il>
 <YQBPR0101MB14276E7F9C127374C3E36952DD2F0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
 <a33ad299-9ec6-0dc9-0926-32f20cb130c5@egr.msu.edu>
 <CAGfybS-a6n=Pkz8iBPj7BQ3=DbFoZRFENmy2wK3B=HzHm5dVWg@mail.gmail.com>
 <YQBPR0101MB142781B3EF4F85A1A6ED2AE5DD290@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
top posting NetApp reply:

…
Here you can see transaction ID (0x5e15f77a) being used over port 886 and the NFS server successfully responds.

4480695 2020-01-08 12:20:54 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 V4 UNLOCK Call (Reply In 4480696) FH:0x54b075a0 svid:13629 pos:0-0
4480696 2020-01-08 12:20:54 132.65.60.56 132.65.116.111 NLM 0x5e15f77a (1578497914) 4045 V4 UNLOCK Reply (Call In 4480695)

Here you see that 2 minutes later the client uses the same transaction ID (0x5e15f77a) and the same port again, but the file handle is different, so the client is unlocking a different file.

4591136 2020-01-08 12:22:54 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
4592588 2020-01-08 12:22:57 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
4598862 2020-01-08 12:23:03 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
4608871 2020-01-08 12:23:21 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0
4635984 2020-01-08 12:23:59 132.65.116.111 132.65.60.56 NLM 0x5e15f77a (1578497914) 886 [RPC retransmission of #4480695]V4 UNLOCK Call (Reply In 4480696) FH:0xb14b75a8 svid:13629 pos:0-0

Transaction ID reuse is also seen for a number of other transaction IDs starting at the same time.

Within ONTAP 9.3 we have changed the way our Replay-Cache tracks requests by including a checksum of the RPC request. Both in this and earlier releases ONTAP would cache the call in frame 4480695, but starting in 9.3 we then cache the checksum as part of that.
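
(Illustration only, not part of the NetApp reply.) The cache behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the names `RequestKey`, `ReplayCache`, `classify` and `remember` are hypothetical, SHA-256 merely stands in for whatever checksum ONTAP computes, and the byte strings are made-up stand-ins for the two UNLOCK calls in the trace. What ONTAP actually does when the checksum differs is not stated here, so the sketch only labels that case.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestKey:
    """Identifies a call the way the trace does: client address, source port, XID."""
    client_ip: str
    client_port: int
    xid: int


class ReplayCache:
    """Toy duplicate-request cache (hypothetical; NOT ONTAP's implementation)."""

    def __init__(self):
        self._entries = {}  # RequestKey -> (checksum, cached reply)

    @staticmethod
    def _checksum(body: bytes) -> bytes:
        # Assumption: any digest of the request body can stand in for the
        # checksum mentioned in the reply; the real algorithm is not given.
        return hashlib.sha256(body).digest()

    def classify(self, key: RequestKey, body: bytes) -> str:
        entry = self._entries.get(key)
        if entry is None:
            return "new"                # key never seen before
        if entry[0] == self._checksum(body):
            return "retransmission"     # same key, same request bytes
        return "xid-reuse"              # same key, different request bytes

    def remember(self, key: RequestKey, body: bytes, reply: bytes) -> None:
        self._entries[key] = (self._checksum(body), reply)


# Values taken from the trace; the request bodies are simplified placeholders.
key = RequestKey("132.65.116.111", 886, 0x5E15F77A)
first_call = b"NLM V4 UNLOCK FH:0x54b075a0 svid:13629 pos:0-0"
later_call = b"NLM V4 UNLOCK FH:0xb14b75a8 svid:13629 pos:0-0"

cache = ReplayCache()
results = [cache.classify(key, first_call)]      # nothing cached yet
cache.remember(key, first_call, reply=b"UNLOCK Reply")
results.append(cache.classify(key, first_call))  # identical bytes resent
results.append(cache.classify(key, later_call))  # same key, other file handle
print(results)
```

The three classifications come out as new, retransmission and xid-reuse. A cache keyed only on address, port and XID could not tell the last two apart, which is presumably why the checksum was added.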

When the client sends the request in frame 4591136 it uses the same transaction ID (0x5e15f77a) and same port again. Here the problem is that we already hold a checksum in cache for the “same transaction”
…

this seems to be happening after the client did not receive the response and re-transmits the request.

danny

> On 24 Dec 2019, at 5:02, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Richard P Mackerras wrote:
>> Hi,
>>
>> We had some bully type workloads emerge when we moved a lot of block
>> storage from old XIV to new all flash 3PAR. I wonder if your IMAP issue
>> might have emerged just because suddenly there was the opportunity with all
>> flash. QOS is good on 9.x ONTAP. If anyone says it’s not then they last
>> looked on 8.x. So I suggest you QOS the IMAP workload.
>>
>> Nobody should be using UDP with NFS unless they have a very specific set
>> of circumstances. TCP was a real step forward.
> Well, I can't argue with this, considering I did the first working implementation
> of NFS over TCP. It was actually Mike Karels that suggested I try doing so,
> There's a paper in a very old Usenix Conference Proceedings, but it is so old
> that it isn't on the Usenix web page (around 1988 in Denver, if I recall). I don't
> even have a copy myself, although I was the author.
>
> Now, having said that, I must note that the Network Lock Manager (NLM) and
> Network Status Monitor (NSM) were not NFS. They were separate stateful
> protocols (poorly designed imho) that Sun never published.
>
> NFS as Sun designed it (NFSv2 and NFSv3) were "stateless server" protocols,
> so that they could work reliably without server crash recovery.
> However, the NLM was inherently stateful, since it was dealing with file locks.
>
> So, you can't really lump the NLM with NFS (and you should avoid use of the
> NLM over any transport imho).
>
> NFSv4 tackled the difficult problem of having a "stateful server" and crash recovery,
> which resulted in a much more complex protocol (compare the size of RFC-1813
> vs RFC-5661 to get some idea of this).
>
> rick
>
> Cheers
>
> Richard
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"